Chapter 1

Introduction 



Continuations control program flow using purely functional means. Informally, a contin- 
uation is a function representing the rest of the program: when passed an intermediate 
result (a value in a functional language, a store in an imperative language), the function 
"continues" the computation to the final result. In LISP programs, for example, the control 
stack can be thought of as representing the continuation of a program: the stack tells the 
interpreter how to continue the computation to the final answer. At a lower level, a program 
counter also represents a continuation, although the "function" may not be very clear. 
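
As an informal illustration of "a function representing the rest of the program," the following Python sketch contrasts a direct-style factorial with a continuation-passing one. The names `fact_direct`, `fact_cps`, and `identity` are illustrative assumptions and do not come from the thesis.

```python
def fact_direct(n):
    """Direct style: pending multiplications are remembered on the call stack."""
    return 1 if n == 0 else n * fact_direct(n - 1)


def fact_cps(n, k):
    """Continuation-passing style: k receives the intermediate result
    and 'continues' the computation to the final answer."""
    if n == 0:
        return k(1)
    # The new continuation records the pending multiplication by n.
    return fact_cps(n - 1, lambda result: k(n * result))


# The identity function plays the role of "nothing left to do".
identity = lambda x: x
```

Passing a different continuation changes what happens to the result without touching `fact_cps` itself, which is the sense in which the continuation is the rest of the program.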

The explicit use of continuations pervades the theory and practice of programming 
languages. Continuations first appeared in continuation-style semantics for imperative lan- 
guages [11, 30, 31]. In this style, continuations are explicitly passed to the meanings of all 
program statements. The meaning of imperative statements can be modeled as functions 
that change the continuation. For example, in an ALGOL-like language with goto <label> 
statements, each label marks a particular continuation. The meaning of the statement 
goto <label> is one that, upon receiving a store and a continuation, discards that contin- 
uation and passes the store to the continuation associated with <label>. Highly imperative 
constructs like goto are difficult or impossible to represent in "direct" semantics in which 
statements are modeled as functions from answers to answers [11, 30]. 

Continuations appear in at least two other settings. In languages such as LISP and 
Scheme, the continuation of a program may be accessed through the control operator 
call-with-current-continuation (call/cc) [23]. The programmer may then use the 
continuation to repeat certain calculations, perform error traps, backtrack through a com- 
putation, or simulate forks and joins [10]. Continuations have also been used in compilers 
for languages such as Scheme and ML. These compilers apply a continuation-passing style 
(cps) transform as a fundamental step in compilation [1, 9, 28]. 

Each of the three settings involves "programming" with continuations, and it is almost 
self-evident that this requires a different style of thinking. What is not obvious, however, 
is whether working in a continuation setting requires new reasoning tools. Indeed, certain 
principles should remain valid in the context of continuations. For example, the substitution 
of actual parameters for formal parameters in procedure calls should not become invalid; 
otherwise, the addition of continuations would change the programming language in drastic 
ways! 

On the other hand, the mere addition of continuation-based control operators to lan- 
guages suggests that continuations change programming in a fundamental way. In the 
presence of control operators, a programmer may be able to distinguish pieces of code that 
were indistinguishable without control operators, making the language more powerful. One 
can make similar arguments for the other two settings. For instance, programs not 
expressible when programming directly in the language become expressible when using 
cps-converted code. 

This thesis attempts to make precise the intuition that continuations "change things" 
in the three settings of continuations. Using specific counterexamples, we shall prove that 
certain familiar reasoning principles are unsound in the three settings of continuations. In 
essence, reasoning about code in the usual way may lead one to draw faulty conclusions 
about the behavior of that code. By understanding the failure of reasoning principles in 
each of the three settings of continuations, we move closer to understanding continuations 
themselves; insights generated by the examples will help in building a suitable theory of 
continuations. 

1.1 Reasoning about Code 

By "reasoning principles" we mean principles for proving equivalences of code. Such 
principles capture the notion of "behavior of code." For example, a $\lambda$-abstraction applied to an 
integer argument in LISP behaves the same (ignoring efficiency issues) as the body of the 
abstraction with the integer in place of the abstracted variable. These two pieces of code are 
equivalent, and the definition of a LISP interpreter may be used to verify this equivalence. 
Two pieces of code are "equivalent" if they produce the same "outcomes" under the 
interpreter. To make this more precise, we must define the observations, the net outcomes 
of the interpreter considered important. Typically, we choose to observe terms at which the 
interpreter stops. In the language $\lambda_v$ defined in Chapter 2, we will observe evaluation to 
numerals.$^1$ Let Eval(M) be a partial function from terms to terms, representing the output 
of the interpreter on terms; we then say 

Definition 1.1 (Informal) Two terms M and N are observationally equivalent if 

Eval(M) and Eval(N) agree on all observations. 

Two programs are observationally equivalent if they produce the same observable results. 
Observational equivalence states that two terms as given cannot be told apart by the 
interpreter. For languages with functional terms, observational equivalence is too coarse; 
one may still be able to distinguish two observationally equivalent terms. For instance, if we 
choose to observe "termination of the interpreter" in LISP, any two $\lambda$-abstractions would 
agree on all observations and hence would be considered observationally equivalent. Yet 
a programmer may be able to distinguish two $\lambda$-abstractions by writing a context (a term 
with a hole) that makes the terms evaluate to different observations. One may formalize 
this ability to distinguish terms: 

Definition 1.2 (Informal) Two terms M and N are observationally distinguishable 
iff for some context $C[\cdot]$, $C[M]$ and $C[N]$ differ on some observation (in other words, are 
not observationally equivalent). 

The complementary notion is, in fact, more important: 

Definition 1.3 (Informal) Two terms M and N are observationally congruent (written 
$M \equiv_{obs} N$) iff they are not observationally distinguishable. 
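
The gap between observational equivalence and observational congruence can be sketched in Python. This is only an informal model: terms are represented as zero-argument thunks, the observation is "evaluates to a numeral," and the names `observe`, `M`, `N`, and `context` are assumptions introduced for the illustration.

```python
def observe(term):
    """Observation: the numeral a term evaluates to, or None for any
    non-numeral value (such as an abstraction)."""
    value = term()
    return value if isinstance(value, int) else None


# Two abstractions, wrapped as thunks: roughly (lambda x. x) and (lambda x. succ x).
M = lambda: (lambda x: x)
N = lambda: (lambda x: x + 1)


def context(hole):
    """The context C[.] = ([.] 0): apply whatever fills the hole to 0."""
    return lambda: hole()(0)
```

As given, `M` and `N` agree on all observations (neither is a numeral), yet the context `context` separates them, so they are observationally equivalent but not observationally congruent in this toy model.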



$^1$ More complex observations may result in finer distinctions between terms; see [4, 17] for an example of 
another reasonable notion of observation. 



Observational congruence is the congruence closure of observational equivalence. 

From a software engineering perspective, observational congruence captures the notion 
of "modularity" of code. For example, two routines that "sort" should be observationally 
congruent: the "sort" routines should be interchangeable in any program, and the program 
should produce the same answers using either routine. Observational congruence also pro- 
vides one definition of a "correct" compiler optimization: if one piece of code is replaced by 
a faster yet observationally congruent piece, the optimization is "safe," i.e., the optimized 
code will still produce the expected answer. 

When we say "reasoning about code," we mean reasoning used to prove observational 
congruences. In fact, almost any reasoning principle may be viewed as a way to verify 
observational congruences. For instance, fixpoint induction in denotational semantics and 
pure $\lambda$-calculus-like equational reasoning are reasoning tools for proving congruences. These 
formal reasoning principles help justify the informal observational congruence reasoning 
used by programmers, clarifying common assumptions about the behavior of code. 

1.2 Outline of Thesis 

We concentrate on the setting of cps conversion, since the cps transform seems fundamental 
to understanding the other two settings of continuations: a continuation transform forms 
the basis of many continuation semantics (cf. [24, 26, 30]) and is often used to describe 
the semantics of call/cc-like operators (cf. [7, 8]). Chapter 2 describes a call-by-value 
functional language $\lambda_v$ and its continuation transform, both of which are the focus of study. 

In Chapter 3, we describe specific examples that show the failure of reasoning princi- 
ples based on observational congruence. These examples will have the form "M and N are 
observationally congruent but not congruent in one of the continuation settings." In par- 
ticular, we show that two terms may be observationally congruent but their cps-transforms 
may not be. Similar observations are also made for the other two settings of continuations. 

The unsoundness of familiar reasoning principles indicates that a theory of continuations 
remains to be found. Chapter 4 discusses possible directions for such a theory. One method 
(currently being pursued) involves extending the retraction-based method of Meyer and 
Wand [15]. One might also seek results tying the three settings of continuations together. 
Finally, an Appendix contains proofs of "standard" theorems for $\lambda_v$. 



Chapter 2 

The Language and its Continuation Transform 



This chapter defines $\lambda_v$, a call-by-value version of the language PCF [20, 25], including an 
interpreter for $\lambda_v$. A call-by-value continuation transform for the language is then given, 
along with theorems that show the correctness of the transform. 

2.1 Syntax 

The familiar syntax of the simply-typed $\lambda$-calculus forms the basis of $\lambda_v$. Each term in $\lambda_v$ 
has a type of the form $o$ or $(\sigma \to \tau)$, where $o$ is the sole base type, the type of natural 
numbers, and $\sigma \to \tau$ is the type of functions from $\sigma$ to $\tau$.$^1$ The set of terms with their 
corresponding types is defined in Figure 2.1. In this definition and throughout the text, 
Greek letters (with the exception of $\kappa$, $\lambda$, and $\mu$) denote types, uppercase Roman letters 
denote terms, the lowercase letters $f$, $g$, and $h$ are $\mu$-variables, and all other letters (e.g., 
$\kappa$, $a$, $b$, $c$) are $\lambda$-variables except when otherwise stated. 

  $x^\sigma : \sigma$                          $\lambda$-variables, where $x \in \mathcal{C}$ 
  $f^\sigma : \sigma$                          $\mu$-variables, where $f \in \mathcal{M}$ 
  $c_l : o$                                    numerals ($l \geq 0$) 
  succ, pred $: o \to o$                       functional constants 
  (cond $B$ $M$ $N$) $: o$                     conditionals, where $B, M, N : o$ 
  $(M\,N) : \tau$                              applications, where $M : \sigma \to \tau$ and $N : \sigma$ 
  $(\lambda x^\sigma.M) : \sigma \to \tau$     $\lambda$-abstractions, where $M : \tau$ 
  $(\mu f^\sigma.M) : \sigma$                  recursive definitions, where $M : \sigma$ 

Figure 2.1: The syntax for $\lambda_v$; here, $\mathcal{C}$ and $\mathcal{M}$ are two disjoint, infinite sets of variables. Each 
variable in $\lambda_v$ is tagged with a type (cf. [20]), but types will often be dropped when the context is 
clear. 

The $\lambda$- and $\mu$-variables occurring in a term may be bound or free [2]. If two terms $M$ 
and $N$ differ only in the names of bound variables, we consider them to be syntactically 
equivalent and write $M \equiv N$ [2]. A term is closed if it contains no free variables; otherwise, 
a term is open. 

Contexts are special terms containing holes. A context $C[\cdot]$ is derived from a term $M$ 
by replacing all free occurrences of some variable in $M$, say $f^\sigma$, by a hole $[\cdot]$. $C[N]$ is the 
result of replacing every hole in $C[\cdot]$ with $N$, where $N : \sigma$ and the type of the hole is $\sigma$. 



$^1$ As is customary, parentheses will frequently be dropped from types with the understanding that $\to$ 
associates to the right. For example, $o \to o \to o$ is short for $(o \to (o \to o))$. 
Figure 2.2: Structured rewrite rules for $\lambda_v$. Substitution of the term $N$ for the variable $x$ in $M$, 
with the necessary renaming of bound variables, is written $M[x := N]$ (see [2] for a formal definition). 

2.2 Operational Semantics 

The relation $\to_v$, the one-step reduction relation on terms of $\lambda_v$, is defined in Figure 2.2 
using a structured operational semantics [19, 21]. In reducing applications, operands are 
substituted in for $\lambda$-bound variables only when the operand is a value. A value (usually 
denoted by $V$) is a $\lambda$-abstraction, a constant, or a $\lambda$-variable. None of these terms can be 
rewritten using $\to_v$, so a value is a term in evaluated form.$^2$ 

It is relatively easy to see, from the fact that values cannot be reduced, that $\to_v$ is deterministic. 
This allows us to define an interpreter for $\lambda_v$ from $\to_v$. Since $\lambda_v$ is a language for arithmetic, 
we choose the final answers of the interpreter to be numerals. The input to an interpreter 
for $\lambda_v$ should therefore be closed terms of base type, which we call complete programs. 
(A complete program is a program coupled with a particular set of inputs.) The reflexive, 
transitive closure of the relation $\to_v$, written $\twoheadrightarrow_v$, can be used to define a partial recursive function 

\[ Eval_v : \text{Complete programs} \to \text{Numerals} \] 

\[ Eval_v(M) = \begin{cases} c_l & \text{if } M \twoheadrightarrow_v c_l \\ \text{undefined} & \text{otherwise} \end{cases} \] 

which is an interpreter for the language. 
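
The interpreter can be sketched in Python as follows. This is a minimal sketch under several assumptions that are not part of the thesis: terms are encoded as tagged tuples, the reduction rules are inferred from the axioms and cases listed later in this chapter (Figure 2.2 itself is not reproduced here), substitution is naive and therefore only safe for closed programs, and a `fuel` bound stands in for genuine divergence.

```python
# Term encoding (an assumption of this sketch):
#   ("var", x)  ("num", n)  ("succ",)  ("pred",)
#   ("cond", B, M, N)  ("app", M, N)  ("lam", x, M)  ("mu", f, M)

def is_value(t):
    # Abstractions, constants and (lambda-)variables are values.
    return t[0] in ("lam", "num", "succ", "pred", "var")


def subst(t, name, s):
    """t[name := s], without renaming; safe when s is closed."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == name else t
    if tag in ("num", "succ", "pred"):
        return t
    if tag == "cond":
        return ("cond",) + tuple(subst(u, name, s) for u in t[1:])
    if tag == "app":
        return ("app", subst(t[1], name, s), subst(t[2], name, s))
    if t[1] == name:  # "lam" or "mu" whose binder shadows name
        return t
    return (tag, t[1], subst(t[2], name, s))


def step(t):
    """One reduction step; None if t is a value or stuck."""
    tag = t[0]
    if tag == "mu":  # unfold the recursive definition
        return subst(t[2], t[1], t)
    if tag == "cond":
        b, m, n = t[1:]
        if b[0] == "num":
            return m if b[1] == 0 else n
        b2 = step(b)
        return None if b2 is None else ("cond", b2, m, n)
    if tag == "app":
        f, a = t[1], t[2]
        if not is_value(f):  # reduce the operator first
            f2 = step(f)
            return None if f2 is None else ("app", f2, a)
        if not is_value(a):  # then the operand
            a2 = step(a)
            return None if a2 is None else ("app", f, a2)
        if f[0] == "lam":
            return subst(f[2], f[1], a)
        if f[0] == "succ" and a[0] == "num":
            return ("num", a[1] + 1)
        if f[0] == "pred" and a[0] == "num":
            return ("num", max(a[1] - 1, 0))
    return None


def eval_v(t, fuel=10_000):
    """Numeral reached by t, or None if none is found within `fuel` steps."""
    for _ in range(fuel):
        if t[0] == "num":
            return t[1]
        t = step(t)
        if t is None:
            return None
    return None
```

Returning `None` conflates "undefined" with "ran out of fuel"; that is a limitation of the sketch, not a property of $Eval_v$.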

In our investigation of the cps transform we will be most interested in reasoning about 
the behavior of code under $Eval_v$. We say that 

Definition 2.1 $M$ observationally approximates $N$, written $M \preceq_v N$, if, for any context 
$C[\cdot]$ such that $C[M]$ and $C[N]$ are complete programs, $C[M] \twoheadrightarrow_v c_l$ implies $C[N] \twoheadrightarrow_v c_l$. 

Two terms $M$ and $N$ are observationally congruent, written $M \equiv^v_{obs} N$, if $M \preceq_v N$ and 
$N \preceq_v M$. 

Observational congruences can be difficult to prove using only the definition [12]. For 
example, consider the terms $N_1 \equiv \lambda x.(\lambda y.y)\,c_3$ and $N_2 \equiv \lambda x.c_3$. If $N_1$ is applied to an 
argument during the evaluation of a program, the "active" subterm at the next stage will 
be $(\lambda y.y)\,c_3$, which will reduce to $c_3$. If $N_2$ appeared as the subterm instead, $c_3$ will again 
be the result. The terms should thus be congruent. This argument, however, is difficult to 
formalize and is of little use in proving other observational congruences. 

$^2$ Using this rationale, $\mu$-variables might also be considered values, if it were not for the fact that 
$\mu$-variables may be replaced by terms that require further evaluation. For example, $f$ gets replaced by a 
non-value in the reduction $\mu f.f \to_v f[f := \mu f.f]$. In contrast, $\lambda$-variables remain values when reduced and 
hence are considered values. This distinction explains the need for two disjoint sets of variables. Plotkin 
also uses two sets of variables in one version of his metalanguage [22]. 

Equational reasoning based on $\to_v$ can be used to prove $N_1 \equiv^v_{obs} N_2$. Define the relation 
$=_v$ by replacing all $\to_v$'s in the definition of $\to_v$ by $=_v$'s, adding the axioms of reflexivity, 
symmetry, and transitivity, and condensing the operational rules with antecedents into the 
congruence rule 

\[ \frac{M =_v M'}{C[M] =_v C[M']} \] 

where $C[\cdot]$ is any context (not necessarily making $C[M]$ a complete program). The rules of 
$=_v$ are sound for proving observational congruences. 

Theorem 2.2 If $M =_v N$, then $M \equiv^v_{obs} N$. 

Proof: Delayed to the Appendix. ■ 

$N_1 \equiv^v_{obs} N_2$ now follows from the fact that $N_1 =_v N_2$. 
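
The equational proof for this example can be spelled out in two steps. The derivation below is a reconstruction from the rules described above, with $N_1$ and $N_2$ the terms $\lambda x.(\lambda y.y)\,c_3$ and $\lambda x.c_3$ of the running example; the name "$\beta_v$" for the application axiom is a label assumed here for convenience.

```latex
\begin{align*}
(\lambda y.y)\,c_3 &=_v y[y := c_3] \equiv c_3
  && \text{($\beta_v$ axiom; $c_3$ is a value)} \\
N_1 \equiv \lambda x.(\lambda y.y)\,c_3 &=_v \lambda x.c_3 \equiv N_2
  && \text{(congruence rule with $C[\cdot] \equiv \lambda x.[\cdot]$)}
\end{align*}
```

Note that the second step rewrites under a $\lambda$, which the one-step reduction $\to_v$ never does; this is exactly what the congruence rule adds.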

The converse to Theorem 2.2 is false: there are terms that are observationally congruent 
but cannot be proven equivalent.$^3$ The following theorem will be useful in verifying 
congruences: 

Theorem 2.3 Let $M$ and $N$ be closed terms of the same type. Then $M \preceq_v N$ iff, for all 
vectors $\vec{V}$ of closed values, $M\,\vec{V} \twoheadrightarrow_v V_0$ implies $N\,\vec{V} \twoheadrightarrow_v V_0'$ and $V_0 \equiv V_0'$ if either is a 
numeral. 

Proof: Delayed to the Appendix. ■ 

Theorem 2.3 states that applicative contexts determine observational congruence (cf. [3]). 

2.3 Continuation Transform 

2.3.1 Definition 

The continuation transform for $\lambda_v$ is based on a cps transform appropriate for call-by-value 
[9, 15, 19]. The transform of a term $M$, written $\overline{M}$, is another term of $\lambda_v$. Figure 2.3 defines 
the transform of a term by structural induction on the term. 

The behavior of the interpreter for $\lambda_v$ provides clues to understanding the continuized 
version of a term. Basically, the flow of control is made explicit by the continuations of 
a cps-converted term. For example, since values are not evaluated, the cps transform of a 
value simply passes the value to a continuation (the rest of the program). For applications 
as well, the continuations in the transform of an application mimic the flow of control in the 
interpreter: the continuation passed to the operator first evaluates the operand and passes 
control to the operand's continuation, which, in turn, applies the operator to the operand. 

The explicit incorporation of continuations requires that the transform change the type 
of a term. A continuized term accepts a continuation as an argument (a function from some 
type to a final answer), and produces a final answer given that continuation. The type of 
final answers for $\lambda_v$ is $o$, so a term of type $o$ is transformed into a term of type $(o \to o) \to o$. 



3 In fact, observational congruence is not axiomatizable [2, 32], so one cannot hope for an equational proof 
system that captures observational congruence. 
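\[ \begin{aligned} 
\overline{x^\sigma} &\equiv \lambda\kappa.\kappa\,x^{\sigma'} \\ 
\overline{f^\sigma} &\equiv \lambda\kappa.f^{(\sigma' \to o) \to o}\,\kappa \\ 
\overline{c_l} &\equiv \lambda\kappa.\kappa\,c_l \\ 
\overline{\text{succ}} &\equiv \lambda\kappa.\kappa\,(\lambda x^o.\lambda\kappa_1.\kappa_1\,(\text{succ}\ x)) \\ 
\overline{\text{pred}} &\equiv \lambda\kappa.\kappa\,(\lambda x^o.\lambda\kappa_1.\kappa_1\,(\text{pred}\ x)) \\ 
\overline{\text{cond}\ B\ M_0\ M_1} &\equiv \lambda\kappa.\overline{B}\,(\lambda m^o.\text{cond}\ m\ (\overline{M_0}\,\kappa)\ (\overline{M_1}\,\kappa)) \\ 
\overline{(M\,N)} &\equiv \lambda\kappa.\overline{M}\,(\lambda m^{(\sigma \to \tau)'}.\overline{N}\,(\lambda n^{\sigma'}.m\ n\ \kappa)) \quad \text{(where $M : \sigma \to \tau$ and $N : \sigma$)} \\ 
\overline{\lambda x^\sigma.M} &\equiv \lambda\kappa.\kappa\,(\lambda x^{\sigma'}.\overline{M}) \\ 
\overline{\mu f^\sigma.M} &\equiv \lambda\kappa.(\mu f^{(\sigma' \to o) \to o}.\overline{M})\,\kappa 
\end{aligned} \] 

Figure 2.3: The continuation transform for $\lambda_v$. The types of continuations $\kappa$ (which have the form 
$\sigma' \to o$) have been omitted for clarity. Note that variables change types when transformed. 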
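
The clauses of the transform can be mimicked directly with Python closures. The sketch below is a shallow embedding, not a syntactic transformation: a transformed term is represented as a Python function that expects a continuation, and the helper names (`cps_value`, `cps_app`, `cps_cond`, `cps_lam`, `succ_bar`, `pred_bar`, `num_bar`) are assumptions made for this illustration. The $\mu$ clause is omitted.

```python
def cps_value(psi_v):
    """Transform of a value V: pass Psi(V) to the continuation."""
    return lambda k: k(psi_v)


def cps_app(m_bar, n_bar):
    """Transform of (M N): evaluate M, then N, then apply with continuation k."""
    return lambda k: m_bar(lambda m: n_bar(lambda n: m(n)(k)))


def cps_cond(b_bar, m0_bar, m1_bar):
    """Transform of (cond B M0 M1): evaluate B, then pick a branch."""
    return lambda k: b_bar(lambda m: m0_bar(k) if m == 0 else m1_bar(k))


def cps_lam(body_bar):
    """Transform of an abstraction, given a Python function mapping the
    (already transformed) argument value to the transformed body."""
    return cps_value(lambda x: body_bar(x))


# Psi(succ) and Psi(pred): take an argument, then a continuation.
succ_bar = cps_value(lambda x: lambda k1: k1(x + 1))
pred_bar = cps_value(lambda x: lambda k1: k1(max(x - 1, 0)))


def num_bar(n):
    """Transform of the numeral c_n."""
    return cps_value(n)


identity = lambda x: x
```

For example, the program $(\lambda x.\text{succ}\ x)\,(\text{pred}\ c_3)$ becomes `cps_app(cps_lam(lambda x: cps_app(succ_bar, cps_value(x))), cps_app(pred_bar, num_bar(3)))`, and applying it to `identity` runs it to a numeral.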

The situation for higher-typed terms is more complicated. The continuation of a higher-order 
term needs to accept functions which, given a value and another continuation, produce 
final answers. The transform of a term of type $\sigma$ is thus a term of type $(\sigma' \to o) \to o$, where 
$\sigma'$ is defined recursively by (cf. [15]) 

\[ \begin{aligned} 
o' &= o \\ 
(\sigma \to \tau)' &= \sigma' \to (\tau' \to o) \to o. 
\end{aligned} \] 
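
The type translation is easy to compute mechanically. In the following Python sketch, types are encoded as the string `"o"` or a tuple `("->", a, b)`; this encoding and the names `arrow`, `prime`, `transformed_type`, and `show` are assumptions of the sketch.

```python
O = "o"  # the sole base type


def arrow(a, b):
    """The function type a -> b."""
    return ("->", a, b)


def prime(t):
    """The translation t': o' = o and (s -> r)' = s' -> (r' -> o) -> o."""
    if t == O:
        return O
    _, s, r = t
    return arrow(prime(s), arrow(arrow(prime(r), O), O))


def transformed_type(t):
    """Type of the transform of a term of type t, namely (t' -> o) -> o."""
    return arrow(arrow(prime(t), O), O)


def show(t):
    """Pretty-print a type, with -> associating to the right."""
    if t == O:
        return "o"
    _, a, b = t
    left = show(a) if a == O else "(" + show(a) + ")"
    return left + " -> " + show(b)
```

For instance, `show(prime(arrow(O, O)))` yields `o -> (o -> o) -> o`: a transformed function takes its argument and then a continuation for its result.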



2.3.2 Fundamental Properties of the Transform 

By inspecting the definition of the transform, one may observe that every operand in a 
transformed term is a value and hence need not be evaluated. In other words, transformed 
terms may be evaluated tail-recursively. Tail-recursiveness can lead to increased efficiency. 
A traditional call-by-value interpreter (or code generated by compilers) uses a stack to 
remember the position of the subterm currently being evaluated. In transformed terms, all 
operands in applications are in evaluated form, so an interpreter designed specifically for 
transformed terms does not require a stack. 4 

A corollary to the fact that all operands are values is unambiguous reducibility: 
call-by-name and call-by-value reduction strategies coincide on transformed terms. Unambiguous 
reducibility allows one to use the transform to simulate call-by-value in a call-by-name 
interpreter, as is done in [19]. 

Of course, the transform must satisfy correctness properties as well. If one expects 
to use the transform as a first step in compilation, for example, transformed terms must 
not produce different answers than the original terms! The continuation transform for the 
language satisfies two properties that guarantee its correctness: provable equality (i.e., $=_v$) 
is preserved by the transform and complete programs produce the same output as their 
transformed versions [9, 19]. 



2.3.2.1 Preservation of equational reasoning 

We follow Plotkin's proof in [19] to show that $M =_v N$ implies $\overline{M} =_v \overline{N}$. 



4 One may regard the cps version of a term as incorporating an explicit representation of the interpreter's 
control stack. 



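Substitutions performed by $=_v$ pose problems to a direct proof. Suppose, for example, 
that $=_v$ performs the substitution $M[x := V]$. We want $\overline{(\lambda x.M)\,V} =_v \overline{M[x := V]}$. In point 
of fact, it is easy to show that $\overline{(\lambda x.M)\,V} =_v \overline{M}[x := \Psi(V)]$, where 

Definition 2.4 If $V$ is a value, then $\Psi(V)$ is defined by 

\[ \begin{aligned} 
\Psi(x^\sigma) &\equiv x^{\sigma'} \\ 
\Psi(c_l) &\equiv c_l \\ 
\Psi(\text{succ}) &\equiv \lambda x.\lambda\kappa_1.\kappa_1\,(\text{succ}\ x) \\ 
\Psi(\text{pred}) &\equiv \lambda x.\lambda\kappa_1.\kappa_1\,(\text{pred}\ x) \\ 
\Psi(\lambda x^\sigma.M) &\equiv \lambda x^{\sigma'}.\overline{M}. 
\end{aligned} \] 

(Essentially, $\Psi(V)$ is $\overline{V}$ without the leading continuation.) The following lemma allows us 
to complete the argument that $\overline{(\lambda x.M)\,V} =_v \overline{M[x := V]}$: 

Lemma 2.5 If $V$ is a value and $x$ is a $\lambda$-variable, then $\overline{M}[x^{\sigma'} := \Psi(V)] \equiv \overline{M[x^\sigma := V]}$. 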

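Proof: By structural induction on $M$. For the base case, $M$ must be a constant or variable: 

Case 1: $M \equiv x$. Then $\overline{M}[x := \Psi(V)] \equiv \lambda\kappa.\kappa\,\Psi(V) \equiv \overline{V} \equiv \overline{M[x := V]}$. 

Case 2: $M \equiv t$ for some variable $t \not\equiv x$. Then $\overline{M}[x := \Psi(V)] \equiv \overline{t} \equiv \overline{M[x := V]}$. 

Case 3: $M \equiv a$ for some constant $a$. Similar to Case 2. 

For the induction case, we also divide into cases depending on the form of $M$: 

Case 1: $M \equiv \text{cond}\ B\ M_0\ M_1$. Then 

\[ \begin{aligned} 
\overline{M}[x := \Psi(V)] &\equiv (\lambda\kappa.\overline{B}\,(\lambda m.\text{cond}\ m\ (\overline{M_0}\,\kappa)\ (\overline{M_1}\,\kappa)))[x := \Psi(V)] \\ 
&\equiv \lambda\kappa.\overline{B[x := V]}\,(\lambda m.\text{cond}\ m\ (\overline{M_0[x := V]}\,\kappa)\ (\overline{M_1[x := V]}\,\kappa)) \quad \text{(by the induction hypothesis)} \\ 
&\equiv \overline{M[x := V]}. 
\end{aligned} \] 

Case 2: $M \equiv (M_1\,M_2)$. Then 

\[ \begin{aligned} 
\overline{M}[x := \Psi(V)] &\equiv (\lambda\kappa.\overline{M_1}\,(\lambda m.\overline{M_2}\,(\lambda n.m\ n\ \kappa)))[x := \Psi(V)] \\ 
&\equiv \lambda\kappa.\overline{M_1[x := V]}\,(\lambda m.\overline{M_2[x := V]}\,(\lambda n.m\ n\ \kappa)) \quad \text{(by the induction hypothesis)} \\ 
&\equiv \overline{M[x := V]}. 
\end{aligned} \] 

Case 3: $M \equiv \lambda y.M'$. If $y \equiv x$, then $\overline{M}[x := \Psi(V)] \equiv \overline{M} \equiv \overline{M[x := V]}$. If $y \not\equiv x$, 

\[ \begin{aligned} 
\overline{M}[x := \Psi(V)] &\equiv (\lambda\kappa.\kappa\,(\lambda y.\overline{M'}))[x := \Psi(V)] \\ 
&\equiv \lambda\kappa.\kappa\,(\lambda y.\overline{M'[x := V]}) \quad \text{(by the induction hypothesis)} \\ 
&\equiv \overline{M[x := V]}. 
\end{aligned} \] 

Case 4: $M \equiv \mu f.M'$. We know that $x \not\equiv f$; so 

\[ \begin{aligned} 
\overline{M}[x := \Psi(V)] &\equiv (\lambda\kappa.(\mu f.\overline{M'})\,\kappa)[x := \Psi(V)] \\ 
&\equiv \lambda\kappa.(\mu f.\overline{M'[x := V]})\,\kappa \quad \text{(by the induction hypothesis)} \\ 
&\equiv \overline{M[x := V]}. 
\end{aligned} \] 

We have exhausted all cases, hence the lemma holds. ■ 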

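The analog of Lemma 2.5 for recursive definitions works somewhat more easily: 

Lemma 2.6 If $f$ is a $\mu$-variable, then $\overline{M}[f^{(\sigma' \to o) \to o} := \mu f^{(\sigma' \to o) \to o}.\overline{N}] \equiv \overline{M[f := \mu f.N]}$. 

Proof: By structural induction on $M$. In the base case, we divide into cases on the form 
of $M$: 

Case 1: $M \equiv f$. Then $\overline{M}[f := \mu f.\overline{N}] \equiv \lambda\kappa.(\mu f.\overline{N})\,\kappa \equiv \overline{\mu f.N} \equiv \overline{M[f := \mu f.N]}$. 

Case 2: $M \equiv t$ for some variable $t \not\equiv f$. Then $\overline{M}[f := \mu f.\overline{N}] \equiv \overline{t} \equiv \overline{M[f := \mu f.N]}$. 

Case 3: $M \equiv a$ for some constant $a$. Similar to Case 2. 

For the induction case, there are four cases to consider: 

Case 1: $M \equiv \text{cond}\ B\ M_0\ M_1$. Then 

\[ \begin{aligned} 
\overline{M}[f := \mu f.\overline{N}] &\equiv (\lambda\kappa.\overline{B}\,(\lambda m.\text{cond}\ m\ (\overline{M_0}\,\kappa)\ (\overline{M_1}\,\kappa)))[f := \mu f.\overline{N}] \\ 
&\equiv \lambda\kappa.\overline{B[f := \mu f.N]}\,(\lambda m.\text{cond}\ m\ (\overline{M_0[f := \mu f.N]}\,\kappa)\ (\overline{M_1[f := \mu f.N]}\,\kappa)) \quad \text{(by the induction hypothesis)} \\ 
&\equiv \overline{M[f := \mu f.N]}. 
\end{aligned} \] 

Case 2: $M \equiv (M_1\,M_2)$. Thus, 

\[ \begin{aligned} 
\overline{M}[f := \mu f.\overline{N}] &\equiv (\lambda\kappa.\overline{M_1}\,(\lambda m.\overline{M_2}\,(\lambda n.m\ n\ \kappa)))[f := \mu f.\overline{N}] \\ 
&\equiv \lambda\kappa.\overline{M_1[f := \mu f.N]}\,(\lambda m.\overline{M_2[f := \mu f.N]}\,(\lambda n.m\ n\ \kappa)) \quad \text{(by the induction hypothesis)} \\ 
&\equiv \overline{M[f := \mu f.N]}. 
\end{aligned} \] 

Case 3: $M \equiv \lambda y.M'$. Note that $f \not\equiv y$; thus 

\[ \begin{aligned} 
\overline{M}[f := \mu f.\overline{N}] &\equiv (\lambda\kappa.\kappa\,(\lambda y.\overline{M'}))[f := \mu f.\overline{N}] \\ 
&\equiv \lambda\kappa.\kappa\,(\lambda y.\overline{M'[f := \mu f.N]}) \quad \text{(by the induction hypothesis)} \\ 
&\equiv \overline{M[f := \mu f.N]}. 
\end{aligned} \] 

Case 4: $M \equiv \mu g.M'$. If $g \equiv f$, $\overline{M}[f := \mu f.\overline{N}] \equiv \overline{M} \equiv \overline{M[f := \mu f.N]}$. On the other 
hand, if $g \not\equiv f$, 

\[ \begin{aligned} 
\overline{M}[f := \mu f.\overline{N}] &\equiv (\lambda\kappa.(\mu g.\overline{M'})\,\kappa)[f := \mu f.\overline{N}] \\ 
&\equiv \lambda\kappa.(\mu g.\overline{M'[f := \mu f.N]})\,\kappa \quad \text{(by the induction hypothesis)} \\ 
&\equiv \overline{M[f := \mu f.N]}. 
\end{aligned} \] 

This concludes the proof. ■ 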

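Given these two lemmas, we may complete the proof of the theorem: 

Theorem 2.7 If $M =_v N$, then $\overline{M} =_v \overline{N}$. 

Proof: By induction on the length $n$ of the proof that $M =_v N$. In the base case, the 
length of the proof is 1, so an axiom was used: 

Case 1: $(\lambda x.M)\,V =_v M[x := V]$, where $V$ is a value. Recall that $\overline{V} \equiv \lambda\kappa.\kappa\,\Psi(V)$. 
Therefore, 

\[ \begin{aligned} 
\overline{(\lambda x.M)\,V} &=_v \lambda\kappa.(\lambda\kappa_1.\kappa_1\,(\lambda x.\overline{M}))\,(\lambda m.\overline{V}\,(\lambda n.m\ n\ \kappa)) \\ 
&=_v \lambda\kappa.(\lambda m.\overline{V}\,(\lambda n.m\ n\ \kappa))\,(\lambda x.\overline{M}) \\ 
&=_v \lambda\kappa.\overline{V}\,(\lambda n.(\lambda x.\overline{M})\ n\ \kappa) \\ 
&=_v \lambda\kappa.(\lambda n.(\lambda x.\overline{M})\ n\ \kappa)\,\Psi(V) \\ 
&=_v \lambda\kappa.(\lambda x.\overline{M})\,\Psi(V)\,\kappa \\ 
&=_v \lambda\kappa.(\overline{M}[x := \Psi(V)])\,\kappa \\ 
&=_v \lambda\kappa.\overline{M[x := V]}\,\kappa 
\end{aligned} \] 

where the last equation follows from Lemma 2.5. Examining the continuation transform, 
we note that every continuized term begins with a $\lambda$-abstraction; thus, 

\[ \lambda\kappa.\overline{M[x := V]}\,\kappa =_v \overline{M[x := V]} \] 

so $\overline{(\lambda x.M)\,V} =_v \overline{M[x := V]}$. 

Case 2: $\text{cond}\ c_0\ M_0\ M_1 =_v M_0$. By calculation, 

\[ \begin{aligned} 
\overline{\text{cond}\ c_0\ M_0\ M_1} &=_v \lambda\kappa.(\lambda\kappa_1.\kappa_1\,c_0)\,(\lambda m.\text{cond}\ m\ (\overline{M_0}\,\kappa)\ (\overline{M_1}\,\kappa)) \\ 
&=_v \lambda\kappa.\text{cond}\ c_0\ (\overline{M_0}\,\kappa)\ (\overline{M_1}\,\kappa) \\ 
&=_v \lambda\kappa.(\overline{M_0}\,\kappa) =_v \overline{M_0}. 
\end{aligned} \] 

Case 3: $\text{cond}\ c_{l+1}\ M_0\ M_1 =_v M_1$. Similar to the previous case. 

Case 4: $\text{succ}\ c_l =_v c_{l+1}$. By calculation, 

\[ \begin{aligned} 
\overline{\text{succ}\ c_l} &=_v \lambda\kappa.(\lambda\kappa_1.\kappa_1\,(\lambda x.\lambda\kappa_2.\kappa_2\,(\text{succ}\ x)))\,(\lambda m.(\lambda\kappa_3.\kappa_3\,c_l)\,(\lambda n.m\ n\ \kappa)) \\ 
&=_v \lambda\kappa.(\lambda\kappa_3.\kappa_3\,c_l)\,(\lambda n.(\lambda x.\lambda\kappa_2.\kappa_2\,(\text{succ}\ x))\ n\ \kappa) \\ 
&=_v \lambda\kappa.(\lambda x.\lambda\kappa_2.\kappa_2\,(\text{succ}\ x))\ c_l\ \kappa \\ 
&=_v \lambda\kappa.\kappa\,(\text{succ}\ c_l) =_v \lambda\kappa.\kappa\,c_{l+1} =_v \overline{c_{l+1}}. 
\end{aligned} \] 

Case 5: $\text{pred}\ c_0 =_v c_0$. Similar to the previous case. 

Case 6: $\text{pred}\ c_{l+1} =_v c_l$. Similar to the previous case. 

Case 7: $\mu f.M =_v M[f := \mu f.M]$. By calculation, 

\[ \begin{aligned} 
\overline{\mu f.M} &=_v \lambda\kappa.(\mu f.\overline{M})\,\kappa \\ 
&=_v \lambda\kappa.(\overline{M}[f := \mu f.\overline{M}])\,\kappa \\ 
&=_v \lambda\kappa.(\overline{M[f := \mu f.M]})\,\kappa =_v \overline{M[f := \mu f.M]}, 
\end{aligned} \] 

the third equation following from Lemma 2.6. 

Case 8: $M =_v M$. Trivial. 

In the induction case, the length of the proof is $n + 1$; again, we divide into cases, this 
time depending on the last rule used: 

Case 1: $M =_v N$ and $N =_v P$ implies $M =_v P$. By the induction hypothesis, we know 
that $\overline{M} =_v \overline{N}$ and $\overline{N} =_v \overline{P}$, so we can conclude that $\overline{M} =_v \overline{P}$ by the transitivity rule. 

Case 2: $M =_v N$ implies $N =_v M$. Trivial. 

Case 3: $M =_v N$ implies $C[M] =_v C[N]$. Using the induction hypothesis $\overline{M} =_v \overline{N}$, an 
easy structural induction on the context $C[\cdot]$ shows that $\overline{C[M]} =_v \overline{C[N]}$. 

This list exhausts the possibilities for the last rule used, hence we are done. ■ 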



2.3.2.2 Adequacy 

Theorem 2.7 does not explain the correspondence between the evaluation of terms and that of 
their cps-versions. For complete programs in particular, we expect the interpreter to give the same 
answers from both the direct and continuized versions, except that continuized versions 
must be passed a "default continuation," viz., the identity function: 

\[ M \twoheadrightarrow_v c_l \quad \text{iff} \quad \overline{M}\,(\lambda x.x) \twoheadrightarrow_v c_l. \] 

Indeed, this fact must hold if we wish to use cps conversion in compilers.$^5$ 

The proof proceeds using the method in [19]. The key observation is that certain reductions 
on transformed terms have no corresponding reduction on non-continuized versions. 
For example, consider the complete program $c_5$. The direct version cannot be reduced, but 
$\overline{c_5}\,(\lambda x.x)$ can be: 

\[ (\lambda\kappa.\kappa\,c_5)\,(\lambda x.x) \to_v (\lambda x.x)\,c_5 \to_v c_5. \] 

The first reduction is called an administrative reduction, since only a continuation is 
passed. The operation $\star$ applies a continuized term to a continuation and performs all possible 
administrative reductions: 

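Definition 2.8 For any term $M : \sigma$ and any value $K : \sigma' \to o$, we define $\overline{M} \star K$ by 

\[ \begin{aligned} 
\overline{M} \star K &\equiv K\,\Psi(M), \quad \text{if $M$ is a value} \\ 
\overline{f^\sigma} \star K &\equiv f^{(\sigma' \to o) \to o}\,K, \quad \text{if $f$ is a $\mu$-variable} \\ 
\overline{(\text{cond}\ B\ M_1\ M_2)} \star K &\equiv \begin{cases} 
\text{cond}\ \Psi(B)\ (\overline{M_1}\,K)\ (\overline{M_2}\,K) & \text{if $B$ is a value} \\ 
\overline{B} \star (\lambda m.\text{cond}\ m\ (\overline{M_1}\,K)\ (\overline{M_2}\,K)) & \text{otherwise} 
\end{cases} \\ 
\overline{(M_1\,M_2)} \star K &\equiv \begin{cases} 
\overline{M_1} \star (\lambda m.\overline{M_2}\,(\lambda n.m\ n\ K)) & \text{if $M_1$ is not a value} \\ 
\overline{M_2} \star (\lambda n.\Psi(M_1)\ n\ K) & \text{if $M_1$, but not $M_2$, is a value} \\ 
\Psi(M_1)\ \Psi(M_2)\ K & \text{otherwise} 
\end{cases} \\ 
\overline{\mu f.M'} \star K &\equiv (\mu f.\overline{M'})\,K 
\end{aligned} \] 

The following lemma confirms that the definition actually represents a "partial reduction" 
of a continuized term: 

$^5$ Note that the $\Rightarrow$ direction follows from Theorems 2.2 and 2.7, but the converse does not follow directly. 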
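Lemma 2.9 If $K$ is a value, then $\overline{M}\,K \twoheadrightarrow_v \overline{M} \star K$. 

Proof: By structural induction on $M$. For the base case, divide into cases depending on 
$M$: 

Case 1: $M \equiv x$. Then $\overline{M}\,K \equiv (\lambda\kappa.\kappa\,x)\,K \to_v K\,x \equiv \overline{M} \star K$. 

Case 2: $M \equiv f$. Then $\overline{M}\,K \equiv (\lambda\kappa.f\,\kappa)\,K \to_v f\,K \equiv \overline{M} \star K$. 

Case 3: $M \equiv c_l$. Then $\overline{M}\,K \equiv (\lambda\kappa.\kappa\,c_l)\,K \to_v K\,c_l \equiv \overline{M} \star K$. 

Case 4: $M \equiv \text{succ}$. Then $\overline{M}\,K \to_v K\,(\lambda x.\lambda\kappa.\kappa\,(\text{succ}\ x)) \equiv \overline{M} \star K$. 

Case 5: $M \equiv \text{pred}$. Similar to the previous case. 

For the induction case, 

Case 1: $M \equiv \text{cond}\ B\ M_1\ M_2$. Then 

\[ \begin{aligned} 
\overline{M}\,K &\equiv (\lambda\kappa.\overline{B}\,(\lambda m.\text{cond}\ m\ (\overline{M_1}\,\kappa)\ (\overline{M_2}\,\kappa)))\,K \\ 
&\to_v \overline{B}\,(\lambda m.\text{cond}\ m\ (\overline{M_1}\,K)\ (\overline{M_2}\,K)). 
\end{aligned} \] 

If $B$ is a value, then $\overline{M}\,K \twoheadrightarrow_v \text{cond}\ \Psi(B)\ (\overline{M_1}\,K)\ (\overline{M_2}\,K)$; otherwise, 

\[ \overline{M}\,K \twoheadrightarrow_v \overline{B} \star (\lambda m.\text{cond}\ m\ (\overline{M_1}\,K)\ (\overline{M_2}\,K)) \equiv \overline{M} \star K \] 

(by the induction hypothesis). 

Case 2: $M \equiv (M_1\,M_2)$. If $M_1$ is not a value, 

\[ \begin{aligned} 
\overline{M}\,K &\equiv (\lambda\kappa.\overline{M_1}\,(\lambda m.\overline{M_2}\,(\lambda n.m\ n\ \kappa)))\,K \\ 
&\to_v \overline{M_1}\,(\lambda m.\overline{M_2}\,(\lambda n.m\ n\ K)) \\ 
&\twoheadrightarrow_v \overline{M_1} \star (\lambda m.\overline{M_2}\,(\lambda n.m\ n\ K)) \equiv \overline{M} \star K 
\end{aligned} \] 

(by the induction hypothesis). If $M_1$ but not $M_2$ is a value, 

\[ \begin{aligned} 
\overline{M}\,K &\twoheadrightarrow_v \overline{M_2}\,(\lambda n.\Psi(M_1)\ n\ K) \\ 
&\twoheadrightarrow_v \overline{M_2} \star (\lambda n.\Psi(M_1)\ n\ K) \equiv \overline{M} \star K 
\end{aligned} \] 

(by the induction hypothesis). Finally, if both $M_1$ and $M_2$ are values, 

\[ \overline{M}\,K \twoheadrightarrow_v \overline{M_2}\,(\lambda n.\Psi(M_1)\ n\ K) \twoheadrightarrow_v \Psi(M_1)\ \Psi(M_2)\ K \equiv \overline{M} \star K. \] 

Case 3: $M \equiv \lambda x.M'$. Then 

\[ \overline{M}\,K \equiv (\lambda\kappa.\kappa\,(\lambda x.\overline{M'}))\,K \to_v K\,(\lambda x.\overline{M'}) \equiv \overline{M} \star K. \] 

Case 4: $M \equiv \mu f.M'$. Then 

\[ \overline{M}\,K \equiv (\lambda\kappa.(\mu f.\overline{M'})\,\kappa)\,K \to_v (\mu f.\overline{M'})\,K \equiv \overline{M} \star K. \] 

This concludes the proof of the lemma. ■ 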

Once the administrative reductions on a continuized term have been performed, the 
next reductions correspond to reductions on the original version of the term: 
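Lemma 2.10 If $M \to_v N$ and $K$ is a value, then $\overline{M} \star K \twoheadrightarrow_v \overline{N} \star K$. 

Proof: By induction on the length of the proof of $M \to_v N$. In the base case, the length of 
the reduction is 1; we divide into cases depending on the operational rule used: 

Case 1: $(\lambda x.M')\,V \to_v M'[x := V]$. Then 

\[ \begin{aligned} 
\overline{M} \star K &\equiv \Psi(\lambda x.M')\ \Psi(V)\ K \\ 
&\to_v (\overline{M'}[x := \Psi(V)])\,K \\ 
&\twoheadrightarrow_v \overline{M'[x := V]} \star K \quad \text{(by Lemmas 2.5 and 2.9)} \\ 
&\equiv \overline{N} \star K. 
\end{aligned} \] 

Case 2: $\text{succ}\ c_l \to_v c_{l+1}$. Then 

\[ \overline{M} \star K \equiv \Psi(\text{succ})\ \Psi(c_l)\ K \twoheadrightarrow_v K\,(\text{succ}\ c_l) \to_v K\,c_{l+1} \equiv \overline{N} \star K. \] 

Case 3: $\text{pred}\ c_0 \to_v c_0$. Similar to the previous case. 

Case 4: $\text{pred}\ c_{l+1} \to_v c_l$. Similar to the previous case. 

Case 5: $\text{cond}\ c_0\ M_1\ M_2 \to_v M_1$. Then 

\[ \overline{M} \star K \equiv \text{cond}\ c_0\ (\overline{M_1}\,K)\ (\overline{M_2}\,K) \twoheadrightarrow_v \overline{M_1} \star K \] 

by Lemma 2.9. 

Case 6: $\text{cond}\ c_{l+1}\ M_1\ M_2 \to_v M_2$. Similar to the previous case. 

Case 7: $\mu f.M' \to_v M'[f := \mu f.M']$. Then 

\[ \begin{aligned} 
\overline{M} \star K &\equiv (\mu f.\overline{M'})\,K \\ 
&\to_v (\overline{M'}[f := \mu f.\overline{M'}])\,K \\ 
&\twoheadrightarrow_v \overline{M'[f := \mu f.M']} \star K 
\end{aligned} \] 

by Lemmas 2.6 and 2.9. 

In the induction case we consider proofs of length greater than 1, and divide into cases 
depending on the last operational rule used: 

Case 1: $B \to_v B'$ implies $\text{cond}\ B\ M_1\ M_2 \to_v \text{cond}\ B'\ M_1\ M_2$. Note that $B$ cannot be a 
value; hence if $B'$ is not a value, 

\[ \begin{aligned} 
\overline{M} \star K &\equiv \overline{B} \star (\lambda m.\text{cond}\ m\ (\overline{M_1}\,K)\ (\overline{M_2}\,K)) \\ 
&\twoheadrightarrow_v \overline{B'} \star (\lambda m.\text{cond}\ m\ (\overline{M_1}\,K)\ (\overline{M_2}\,K)) \equiv \overline{N} \star K 
\end{aligned} \] 

by the induction hypothesis. If $B'$ is a value, then 

\[ \overline{M} \star K \twoheadrightarrow_v \text{cond}\ \Psi(B')\ (\overline{M_1}\,K)\ (\overline{M_2}\,K) \equiv \overline{N} \star K. \] 

Case 2: $P \to_v P'$ implies $\text{succ}\ P \to_v \text{succ}\ P'$. $P$ cannot be a value, so if $P'$ is not a 
value, 

\[ \begin{aligned} 
\overline{M} \star K &\equiv \overline{P} \star (\lambda n.(\lambda x.\lambda\kappa.\kappa\,(\text{succ}\ x))\ n\ K) \\ 
&\twoheadrightarrow_v \overline{P'} \star (\lambda n.(\lambda x.\lambda\kappa.\kappa\,(\text{succ}\ x))\ n\ K) \equiv \overline{N} \star K 
\end{aligned} \] 

by the induction hypothesis. If $P'$ is a value, then 

\[ \overline{M} \star K \twoheadrightarrow_v (\lambda x.\lambda\kappa.\kappa\,(\text{succ}\ x))\ \Psi(P')\ K \equiv \overline{N} \star K. \] 

Case 3: $P \to_v P'$ implies $\text{pred}\ P \to_v \text{pred}\ P'$. Similar to the previous case. 

Case 4: $Q \to_v Q'$ implies $(\lambda x.P)\,Q \to_v (\lambda x.P)\,Q'$. Similar to the previous case. 

Case 5: $P \to_v P'$ implies $P\,Q \to_v P'\,Q$. $P$ cannot be a value, so if $P'$ is not a value, 

\[ \begin{aligned} 
\overline{M} \star K &\equiv \overline{P} \star (\lambda m.\overline{Q}\,(\lambda n.m\ n\ K)) \\ 
&\twoheadrightarrow_v \overline{P'} \star (\lambda m.\overline{Q}\,(\lambda n.m\ n\ K)) \equiv \overline{N} \star K 
\end{aligned} \] 

by the induction hypothesis. If $P'$ is a value and $Q$ is not, then 

\[ \begin{aligned} 
\overline{M} \star K &\twoheadrightarrow_v \overline{Q}\,(\lambda n.\Psi(P')\ n\ K) \\ 
&\twoheadrightarrow_v \overline{Q} \star (\lambda n.\Psi(P')\ n\ K) \equiv \overline{N} \star K 
\end{aligned} \] 

by Lemma 2.9. If both $P'$ and $Q$ are values, then 

\[ \begin{aligned} 
\overline{M} \star K &\twoheadrightarrow_v \overline{Q}\,(\lambda n.\Psi(P')\ n\ K) \\ 
&\twoheadrightarrow_v \Psi(P')\ \Psi(Q)\ K \equiv \overline{N} \star K. 
\end{aligned} \] 

As all operational rules have been considered, we are done. ■ 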

These facts about administrative and non-administrative reductions on continuized 
terms give us the ability to prove the following theorem originally due to Fischer [9]: 

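Theorem 2.11 (Adequacy) If $M$ is a complete program, then 

\[ Eval_v(M) = c_l \quad \text{iff} \quad Eval_v(\overline{M}\,(\lambda x^o.x)) = c_l. \] 

Proof: ($\Rightarrow$) Suppose $Eval_v(M) = c_l$; then we know that $M \twoheadrightarrow_v c_l$. By Lemmas 2.9 and 
2.10 we then have 

\[ \overline{M}\,(\lambda x.x) \twoheadrightarrow_v \overline{M} \star (\lambda x.x) \twoheadrightarrow_v \overline{c_l} \star (\lambda x.x) \twoheadrightarrow_v c_l. \] 

Thus, $Eval_v(\overline{M}\,(\lambda x.x)) = c_l$. 

($\Leftarrow$) Suppose $Eval_v(M)$ is not defined. Then 

\[ M \to_v M_1 \to_v M_2 \to_v \cdots \] 

By Lemmas 2.9 and 2.10, we thus know 

\[ \overline{M}\,(\lambda x.x) \twoheadrightarrow_v \overline{M} \star (\lambda x.x) \twoheadrightarrow_v \overline{M_1} \star (\lambda x.x) \twoheadrightarrow_v \overline{M_2} \star (\lambda x.x) \twoheadrightarrow_v \cdots \] 

so $Eval_v(\overline{M}\,(\lambda x^o.x))$ is not defined either. ■ 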






Chapter 3 

Continuations May Be Unreasonable 



The Adequacy Theorem establishes a strong connection between the evaluation of terms 
and their continuized versions. The theorem easily extends to reasoning about complete 
programs, viz., proving observational congruences. It follows that for complete programs 
$M$ and $N$, 

\[ M \equiv^v_{obs} N \quad \text{iff} \quad \overline{M}\,(\lambda x.x) \equiv^v_{obs} \overline{N}\,(\lambda x.x). \] 

The connection between direct and continuized versions of higher-order terms is less obvious, 
but one may still see a partial relationship between reasoning on direct versus reasoning on 
continuized terms: 

Corollary 3.1 If $\overline{M} \equiv^v_{obs} \overline{N}$, then $M \equiv^v_{obs} N$. 

Proof: Suppose $M$ and $N$ were distinguishable by some context $C[\cdot]$. Then by the Adequacy 
Theorem, the context $\overline{C[\cdot]}\,(\lambda x.x)$ would distinguish $\overline{M}$ and $\overline{N}$, a contradiction. ■ 

In particular, if one can distinguish two terms by a context, the transforms of those terms 
will also be distinguishable. 

The problem with the continuation transform is that the converse of Corollary 3.1 does 
not hold: observational congruence on direct terms does not coincide with congruence on 
continuized terms. Similar anomalies occur in the other two settings. For example, suppose 
we augment $\lambda_v$ with the call/cc-like operators C and A defined in [7, 8]. Terms that are 
observationally congruent in $\lambda_v$ may become distinguishable using contexts containing these 
new operators. In the case of continuation semantics, there are observationally congruent 
terms that are equivalent in a direct semantics but not equivalent in a continuation semantics. 
Reasoning principles based on observational congruence may thus become unsound in 
settings involving continuations. 

In the continuation transform setting, the anomaly is manifested at terms of higher 
type. In particular, two higher-order closed terms may be observationally congruent but 
their transforms may not be: 

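Theorem 3.2 There exist two closed, pure (i.e., containing no constants, conditionals, or
recursion) terms, namely

$M_1 = \lambda x^{o \to o \to o}.\lambda y^{o \to o}.\lambda z^{o}.(\lambda w.x\; z\; w)\; (y\; z)$
$M_2 = \lambda x^{o \to o \to o}.\lambda y^{o \to o}.\lambda z^{o}.x\; z\; (y\; z),$

with $M_1 \equiv^{obs}_v M_2$ but $\overline{M_1} \not\equiv^{obs}_v \overline{M_2}$.

Proof: To show $M_1 \equiv^{obs}_v M_2$, we proceed in a purely operational fashion using
Theorem 2.3.¹ We first show that $M_1 \sqsubseteq_v M_2$. Pick any values $V_1$, $V_2$, and $V_3$; then $M_2 \twoheadrightarrow_v V$,
$M_2\; V_1 \twoheadrightarrow_v V'$ and $M_2\; V_1\; V_2 \twoheadrightarrow_v V''$, so all vectors $\vec{V}$ of length 0, 1, or 2 make the statement

$M_1\; \vec{V} \twoheadrightarrow_v V_0$ implies $M_2\; \vec{V} \twoheadrightarrow_v V_0'$

hold. Now suppose $M_1\; V_1\; V_2\; V_3 \twoheadrightarrow_v c_l$. Then

$M_1\; V_1\; V_2\; V_3 \twoheadrightarrow_v (\lambda w.V_1\; V_3\; w)\; (V_2\; V_3) \twoheadrightarrow_v c_l$

so it must be the case that $V_2\; V_3 \twoheadrightarrow_v V$ and $V_1\; V_3 \twoheadrightarrow_v V'$ for some values $V$ and $V'$.
Therefore,

$M_2\; V_1\; V_2\; V_3 \twoheadrightarrow_v V_1\; V_3\; (V_2\; V_3) \twoheadrightarrow_v V'\; V \twoheadrightarrow_v c_l.$

Thus, by Theorem 2.3, $M_1 \sqsubseteq_v M_2$. Using a similar argument, one can show $M_2 \sqsubseteq_v M_1$.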
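To show that $\overline{M_1} \not\equiv^{obs}_v \overline{M_2}$, we first reduce $\overline{M_1}$ and $\overline{M_2}$ using $=_v$:

$\overline{M_1} =_v \lambda\kappa_0.\kappa_0\; (\lambda x.\lambda\kappa_1.\kappa_1\; (\lambda y.\lambda\kappa_2.\kappa_2\; (\lambda z.\lambda\kappa_3.y\; z\; (\lambda n.x\; z\; (\lambda m.m\; n\; \kappa_3)))))$
$\overline{M_2} =_v \lambda\kappa_0.\kappa_0\; (\lambda x.\lambda\kappa_1.\kappa_1\; (\lambda y.\lambda\kappa_2.\kappa_2\; (\lambda z.\lambda\kappa_3.x\; z\; (\lambda m.y\; z\; (\lambda n.m\; n\; \kappa_3)))))$

(where the types have been omitted for clarity). Intuitively, the difference between $\overline{M_1}$ and
$\overline{M_2}$ comes from a difference in the way $M_1$ and $M_2$ are reduced when applied to arguments:
$M_1$ evaluates $(y\; z)$ first, while $M_2$ evaluates $(x\; z)$ first. The typable context

$C[\cdot] = [\cdot]\; N_0$, where

$N_0 = \lambda p.p\; (\lambda a.\lambda b.c_1)\; N_1$, where

$N_1 = \lambda q.q\; (\lambda a.\lambda b.c_2)\; N_2$, where

$N_2 = \lambda r.r\; c_1\; (\lambda a.a)$

distinguishes $\overline{M_1}$ and $\overline{M_2}$, since $C[\overline{M_1}]$ terminates with result $c_2$ and $C[\overline{M_2}]$ terminates with
result $c_1$:

$C[\overline{M_1}] =_v (\lambda a.\lambda b.c_2)\; c_1\; (\lambda n.(\lambda a.\lambda b.c_1)\; c_1\; (\lambda m.m\; n\; (\lambda a.a)))$

$=_v c_2$

$C[\overline{M_2}] =_v (\lambda a.\lambda b.c_1)\; c_1\; (\lambda m.(\lambda a.\lambda b.c_2)\; c_1\; (\lambda n.m\; n\; (\lambda a.a)))$

$=_v c_1.$

Thus $\overline{M_1} \not\equiv^{obs}_v \overline{M_2}$.² ■

Using a marked language (cf. Appendix), one can show that the untyped versions of $M_1$
and $M_2$ are congruent in any untyped context. Nevertheless, a simple typable context using
only numerals distinguishes their transforms.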
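The distinguishing context in the proof of Theorem 3.2 can be replayed directly, since Python is call-by-value and has first-class functions. The sketch below transcribes the reduced forms of the two transforms and the terms $N_0$, $N_1$, $N_2$ as Python lambdas; the names `M1_bar`, `M2_bar`, `K_c1`, and `K_c2` are introduced for this illustration, and the numerals $c_1$ and $c_2$ are represented by the integers 1 and 2 as an assumption of the encoding.

```python
c1, c2 = 1, 2

# Direct terms: M1 evaluates (y z) first, M2 evaluates (x z) first.
M1 = lambda x: lambda y: lambda z: (lambda w: x(z)(w))(y(z))
M2 = lambda x: lambda y: lambda z: x(z)(y(z))

# Reduced forms of the transforms, transcribed from the proof.
M1_bar = lambda k0: k0(
    lambda x: lambda k1: k1(
        lambda y: lambda k2: k2(
            lambda z: lambda k3: y(z)(lambda n: x(z)(lambda m: m(n)(k3))))))
M2_bar = lambda k0: k0(
    lambda x: lambda k1: k1(
        lambda y: lambda k2: k2(
            lambda z: lambda k3: x(z)(lambda m: y(z)(lambda n: m(n)(k3))))))

# "Illegal" arguments: they ignore the continuation b and answer immediately.
K_c1 = lambda a: lambda b: c1
K_c2 = lambda a: lambda b: c2

N2 = lambda r: r(c1)(lambda a: a)
N1 = lambda q: q(K_c2)(N2)
N0 = lambda p: p(K_c1)(N1)

context = lambda hole: hole(N0)

result_1 = context(M1_bar)   # y z runs first, so K_c2 answers
result_2 = context(M2_bar)   # x z runs first, so K_c1 answers

# On ordinary (direct) arguments the two source terms agree.
add = lambda a: lambda b: a + b
succ = lambda a: a + 1
same_direct = M1(add)(succ)(3) == M2(add)(succ)(3)
```

Here `result_1` is 2 and `result_2` is 1, matching the two reductions in the proof. The terms `K_c1` and `K_c2` discard their continuation argument, which is exactly the behaviour no transformed direct term would exhibit.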

¹ Other techniques exist for verifying congruences: one may rely upon either an adequate or fully-abstract
denotational semantics or upon an equational system sound for $\equiv^{obs}_v$ yet strong enough to prove the
congruence [12, 20]. Either method rests upon a nontrivial adequacy or soundness proof. Plotkin [18] claims both
methods can be used to prove $M_1 \equiv^{obs}_v M_2$, using either pre-domains [22] or Moggi's $\lambda_p$ [16], but I have not
worked through the proofs of adequacy of the pre-domain semantics or soundness of $\lambda_p$ for $\equiv^{obs}_v$.

² In fact, a stronger statement is true: $\overline{M_1} \not\sqsubseteq_v \overline{M_2}$ and $\overline{M_2} \not\sqsubseteq_v \overline{M_1}$.




The Adequacy Theorem clarifies why $\overline{M_1} \not\equiv^{obs}_v \overline{M_2}$: a context with "illegal" continuations
distinguishes the continuized terms. One could sensibly argue that $\overline{M_1}$ and $\overline{M_2}$ should not
be distinguished, since the distinguishing context will never arise under the intended uses
of $\overline{M_1}$ and $\overline{M_2}$. But granting this, the theorem nevertheless points out a legitimate concern:
what methods shall we use to prove that two terms are congruent with respect to all "legal"
contexts, and what exactly are the legal contexts? This question might arise if we wanted
to justify a post-transform code optimization in which transformed code $\overline{M}$ was replaced
by an "optimized" expression $N$ equivalent to $\overline{M}$ in all legal contexts. For any $N_0$, $N$ itself
need not equal $\overline{N_0}$.

It is not surprising that Theorem 3.2 has an analog in the call/cc setting. Consider,
for example, the language $\lambda_c$ with the call/cc-like operator $\mathcal{C}$ and the abort operator $\mathcal{A}$
[6, 7, 8]. More precisely, $\lambda_c$ has the same syntax as the untyped version of $\lambda_v$ (i.e., where no
variables are decorated with types, and terms need not be well-typed), with the additional
terms $\mathcal{C}\, M$ and $\mathcal{A}\, M$. The reduction relation for $\lambda_c$, $\to_c$, is defined by the rules

$(\mathcal{A}\, M)\; N \to_c \mathcal{A}\, M \qquad (\mathcal{C}\, M)\; N \to_c \mathcal{C}\, (\lambda k.M\; (\lambda m.k\; (m\; N)))$

$V\; (\mathcal{A}\, M) \to_c \mathcal{A}\, M \qquad V\; (\mathcal{C}\, M) \to_c \mathcal{C}\, (\lambda k.M\; (\lambda v.k\; (V\; v)))$

and the outermost computation rules (which are only applicable in empty contexts)

$\mathcal{A}\, M \triangleright_c M \qquad \mathcal{C}\, M \triangleright_c M\; (\lambda x.\mathcal{A}\, x)$

in addition to the (untyped versions of) rules of $\to_v$. Let $\twoheadrightarrow_c$ be the reflexive, transitive
closure of $(\to_c \cup \triangleright_c)$, and let $\equiv^{obs}_c$ denote the observational congruence relation on terms of
$\lambda_c$ when observing numerals. Then

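Theorem 3.3 If $M_1$ and $M_2$ are the terms above, $M_1 \not\equiv^{obs}_c M_2$.

Proof: Let $C[\cdot]$ be the context $[\cdot]\; (\lambda x.\Omega)\; (\lambda y.\mathcal{C}\, (\lambda x.c_1))\; c_1$. Here, $\Omega$ is any divergent term
(such as $\mu f.f$). This context forces $C[M_2]$ to diverge but makes $C[M_1]$ converge to $c_1$:

$C[M_1] \twoheadrightarrow_c (\lambda w.(\lambda x.\Omega)\; c_1\; w)\; ((\lambda y.\mathcal{C}\, (\lambda x.c_1))\; c_1)$

$\twoheadrightarrow_c (\lambda w.(\lambda x.\Omega)\; c_1\; w)\; (\mathcal{C}\, (\lambda x.c_1))$

$\twoheadrightarrow_c \mathcal{C}\, (\lambda k.(\lambda x.c_1)\; (\lambda v.k\; ((\lambda w.(\lambda x.\Omega)\; c_1\; w)\; v)))$

$\triangleright_c (\lambda k.(\lambda x.c_1)\; (\lambda v.k\; ((\lambda w.(\lambda x.\Omega)\; c_1\; w)\; v)))\; (\lambda x.\mathcal{A}\, x)$

$\twoheadrightarrow_c (\lambda x.c_1)\; (\lambda v.(\lambda x.\mathcal{A}\, x)\; ((\lambda w.(\lambda x.\Omega)\; c_1\; w)\; v))$

$\twoheadrightarrow_c c_1$

$C[M_2] \twoheadrightarrow_c ((\lambda x.\Omega)\; c_1)\; ((\lambda y.\mathcal{C}\, (\lambda x.c_1))\; c_1)$
$\to_c \Omega\; ((\lambda y.\mathcal{C}\, (\lambda x.c_1))\; c_1)$
$\twoheadrightarrow_c \Omega\; ((\lambda y.\mathcal{C}\, (\lambda x.c_1))\; c_1)$



Thus, $M_1 \not\equiv^{obs}_c M_2$. ■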
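The evaluation-order difference exploited in this proof can be imitated in Python under two loudly stated simplifications. First, only the special case of $\mathcal{C}$ used here is modelled, namely an argument that ignores its continuation, so that its result becomes the answer of the whole program; an exception (`Abort`, a name chosen for this sketch) stands in for that behaviour. Second, $\Omega$ is replaced by raising a hypothetical `Diverged` exception, since a genuinely divergent term would simply never return.

```python
class Abort(Exception):
    """Discards the rest of the computation and carries the final answer."""
    def __init__(self, value):
        super().__init__(value)
        self.value = value

class Diverged(Exception):
    """Stand-in for a divergent term; a real Omega would loop forever."""

def omega():
    raise Diverged()

def C_discarding(body):
    """Special case of C M in which M ignores its continuation argument."""
    raise Abort(body(None))

def run(program):
    """Top level: an Abort delivers the answer of the whole program."""
    try:
        return program()
    except Abort as abort:
        return abort.value

M1 = lambda x: lambda y: lambda z: (lambda w: x(z)(w))(y(z))
M2 = lambda x: lambda y: lambda z: x(z)(y(z))

c1 = 1

def context(hole):
    # [.] (\x.Omega) (\y.C (\x.c1)) c1
    return hole(lambda _x: omega())(lambda _y: C_discarding(lambda _k: c1))(c1)

answer_1 = run(lambda: context(M1))   # (y z) runs first and aborts with c1

try:
    run(lambda: context(M2))          # (x z) runs first and "diverges"
    m2_diverged = False
except Diverged:
    m2_diverged = True
```

Python evaluates the operator of a call before its operands, so `x(z)(y(z))` reaches the divergent `x(z)` first, while the beta-expanded form in `M1` reaches `y(z)` first. This mirrors the two reduction sequences above but is not an implementation of the full operator $\mathcal{C}$.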

The particular terms $M_1$ and $M_2$ can also be used to point out problems with continuation
semantics. If one bases the semantics of $\lambda_v$ on the transform, i.e. the meaning of a
term $M$ is the meaning of $\overline{M}$ in some well-chosen model, two terms may be observationally
congruent but fail to be equivalent in the model. The terms $M_1$ and $M_2$ again provide the
desired example.

Less contrived examples appear in the literature. Meyer and Sieber, for instance, point
out that two Algol blocks may be observationally congruent but not congruent if goto 
statements are allowed [14]. Since jumps are usually definable in a continuation semantics, 
the two blocks will not be semantically equivalent. Reasoning principles based on a con- 
tinuation semantics may thus lead one to conclude facts that are not true about the actual 
behavior of code. 

The failure of familiar reasoning principles seems to be known (albeit informally) in 
the community of compiler designers. In the presence of control operators or cps-converted 
code, typical compiler optimizations are unsound and procedure calls are often treated as 
"black holes." But one need not conclude from the failure of some reasoning principles that 
the situation for continuations is a black hole. There are interesting reasoning principles 
which hold in continuation settings. For example, consider the $\lambda_v$ terms

$P_1 = \lambda a.\lambda b.(\lambda x.x)\; ((\lambda y.y)\; (a\; b))$
$P_2 = \lambda a.\lambda b.(\lambda x.x)\; (a\; b)$

that are not provably equivalent using $=_v$. In $\lambda_c$ these two terms are observationally
congruent, a fact proven by Felleisen [5], who has developed further principles for proving
observational congruences in this setting. A setting involving continuations seems to require
a new theory for reasoning about code. Such a theory remains to be found.



Chapter 4

Conclusion 



Reasoning about the behavior of cps-converted code requires additional assumptions if con- 
verted terms are to behave as their direct versions. Theorem 3.2 makes this formal: con- 
tinuations not arising in continuized contexts may distinguish cps-converted terms. Two 
possible approaches for a theory of continuations may be based upon this observation. 

One approach to a theory of continuations attempts to capture the notion of a "legal" 
continuation. An algebraic method along these lines is developed in [15] using retractions.¹

Definition 4.1 (Informal) A retraction pair $(i, j)$ is a pair of functions such that for
any $x$, $j\; (i\; x) = x$.

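Meyer and Wand define retraction pairs (at all types) that, when applied to a continuized
term, supply the right continuations at the right time. Specifically, in the simply-typed,
call-by-name $\lambda$-calculus with no constants ($\lambda_n$, with $\beta\eta$ equational reasoning $=_n$), Meyer
and Wand prove

Theorem 4.2 (Meyer, Wand) For any type $\sigma$, there exist $\lambda_n$-definable retraction pairs
$(i_\sigma, j_\sigma)$ and $(I_\sigma, J_\sigma)$, where $i_\sigma : \sigma \to \sigma'$, $j_\sigma : \sigma' \to \sigma$, $I_\sigma : \sigma' \to ((\sigma' \to o) \to o)$, and
$J_\sigma : ((\sigma' \to o) \to o) \to \sigma'$, namely

$$I_\sigma = \lambda x^{\sigma'}.\lambda\kappa^{\sigma' \to o}.\kappa\; x$$

$$J_\sigma = \begin{cases} \lambda x^{(o \to o) \to o}.x\; (\lambda a^{o}.a) & \text{if } \sigma = o \\ \lambda x^{(\sigma' \to o) \to o}.\lambda b^{\sigma_1'}.\lambda\kappa^{\tau' \to o}.x\; (\lambda a^{\sigma'}.a\; b\; \kappa) & \text{if } \sigma = \sigma_1 \to \tau \end{cases}$$

$$i_\sigma = \begin{cases} \lambda x^{o}.x & \text{if } \sigma = o \\ \lambda x^{\sigma_1 \to \tau}.\lambda a^{\sigma_1'}.I_\tau\; (i_\tau\; (x\; (j_{\sigma_1}\; a))) & \text{if } \sigma = \sigma_1 \to \tau \end{cases}$$

$$j_\sigma = \begin{cases} \lambda x^{o}.x & \text{if } \sigma = o \\ \lambda x^{(\sigma_1 \to \tau)'}.\lambda a^{\sigma_1}.j_\tau\; (J_\tau\; (x\; (i_{\sigma_1}\; a))) & \text{if } \sigma = \sigma_1 \to \tau \end{cases}$$

Moreover, $M =_n j_\sigma\; (J_\sigma\; \overline{M})$ for any closed, pure term $M$.

By applying the retractions, one can thus recover the meaning of a direct term from its
continuized form.²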
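To make the shape of these retraction pairs concrete, the following Python sketch defines the four families by recursion on types, encoding the base type as the string `"o"` and an arrow type as a pair `(s, t)`. This encoding, the use of Python integers for the base type, and the two hand-written transforms `succ_bar` and `apply_bar` are assumptions made for the example. Python is call-by-value, so this only illustrates the equations on terminating examples and says nothing about the call-by-name setting of the theorem.

```python
O = "o"                      # base type; an arrow type s -> t is the pair (s, t)
identity = lambda a: a

def I(x):
    """I_sigma: wrap a value as a computation expecting a continuation."""
    return lambda k: k(x)

def J(sigma):
    """J_sigma: extract a (transformed) value from a computation."""
    if sigma == O:
        return lambda x: x(identity)
    return lambda x: lambda b: lambda k: x(lambda a: a(b)(k))

def i(sigma):
    """i_sigma: send a direct value to its continuized counterpart."""
    if sigma == O:
        return identity
    s, t = sigma
    return lambda x: lambda a: I(i(t)(x(j(s)(a))))

def j(sigma):
    """j_sigma: send a continuized value back to a direct one."""
    if sigma == O:
        return identity
    s, t = sigma
    return lambda x: lambda a: j(t)(J(t)(x(i(s)(a))))

succ = lambda n: n + 1
o_to_o = (O, O)

# Retraction property j (i x) = x, observed pointwise at type o -> o.
round_trip = j(o_to_o)(i(o_to_o)(succ))(3)

# Hand-written transform of \n. succ n, recovered via j (J M_bar).
succ_bar = lambda k: k(lambda n: lambda k2: k2(n + 1))
recovered_succ = j(o_to_o)(J(o_to_o)(succ_bar))

# Hand-written transform of \f.\x. f x at type (o -> o) -> o -> o.
apply_type = (o_to_o, o_to_o)
apply_bar = lambda k: k(lambda f: lambda k1: k1(lambda x: lambda k2: f(x)(k2)))
recovered_apply = j(apply_type)(J(apply_type)(apply_bar))
```

In this encoding `round_trip`, `recovered_succ(3)`, and `recovered_apply(succ)(3)` all evaluate to 4, and `J(O)(I(5))` gives back 5.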



¹ Inclusive predicates have also been used to establish connections between the direct and continuation
semantics of a language [24, 26, 29]. The inclusive predicate approach seems necessary in cases where the
denotational domains are built recursively.

² Even in the simplified setting of $\lambda_n$, we cannot expect to have $\overline{M} = i(M)$ for any $i$. This follows because
there are two pure, closed terms $M$, $N$ where $M =_n N$ but $\overline{M}$ and $\overline{N}$ $\beta\eta$-convert to distinct normal forms,
namely the terms $M = \lambda a.\lambda b.\lambda c.(\lambda z.a)\; (b\; c)$ and $N = \lambda a.\lambda b.\lambda c.a$. If $\vdash i(M) = \overline{M}$ and $\vdash i(N) = \overline{N}$, then it
would follow that $\vdash \overline{M} = \overline{N}$ which, by Statman's typical ambiguity theorem [27], is equationally inconsistent.
In the case of $\lambda_v$, we similarly cannot have $\overline{M} = i(M)$ for any $i$ by Theorem 3.2.

Theorem 4.2 can be misleading as soon as recursion is added to the language. In the
pure simply-typed calculus, call-by-name and call-by-value convertibility coincide since no
term causes a divergent computation [2]. Because call-by-name equational reasoning is not
sound for the observational congruence theory of $\lambda_v$, the retraction pairs above may not be
appropriate for $\lambda_v$. In fact, the retraction pairs are no longer retractions: one can only show
that $M \sqsubseteq_v J_\sigma\; (I_\sigma\; M)$ and $M \sqsubseteq_v j_\sigma\; (i_\sigma\; M)$. We conjecture that a similar reformulation of
Theorem 4.2 holds.³

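Conjecture 4.3 For any closed term $M$ of type $\sigma$, $M \sqsubseteq_v j_\sigma\; (J_\sigma\; \overline{M})$.

This conjecture does not hold if we reverse the $\sqsubseteq_v$:

Theorem 4.4 Let $\sigma = (o \to o \to o \to o) \to (o \to o) \to o \to (o \to o)$ and let

$S = \lambda x.\lambda y.\lambda z.x\; z\; (y\; z)$

be of type $\sigma$. Then $j_\sigma\; (J_\sigma\; \overline{S}) \not\sqsubseteq_v S$.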

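Proof: In the proof of Theorem 3.2, we saw that

$\overline{S} =_v \lambda\kappa_0.\kappa_0\; (\lambda x.\lambda\kappa_1.\kappa_1\; (\lambda y.\lambda\kappa_2.\kappa_2\; (\lambda z.\lambda\kappa_3.x\; z\; (\lambda m.y\; z\; (\lambda n.m\; n\; \kappa_3)))))$

Using the fact that $(i\; V)$ is $=_v$ to a value, we can find a simpler form for $j\,(J\; \overline{S})$:

$j_\sigma\; (J_\sigma\; \overline{S}) =_v j\,(\lambda x.\lambda\kappa_1.\kappa_1\; (\lambda y.\lambda\kappa_2.\kappa_2\; (\lambda z.\lambda\kappa_3.x\; z\; (\lambda m.y\; z\; (\lambda n.m\; n\; \kappa_3)))))$

$=_v \lambda a_1.j\,(J\; (\lambda\kappa_1.\kappa_1\; (\lambda y.\lambda\kappa_2.\kappa_2\; (\lambda z.\lambda\kappa_3.(i\; a_1)\; z\; (\lambda m.y\; z\; (\lambda n.m\; n\; \kappa_3))))))$
$=_v \lambda a_1.\lambda a_2.j\,(J\; (\lambda\kappa_2.\kappa_2\; (\lambda z.\lambda\kappa_3.(i\; a_1)\; z\; (\lambda m.(i\; a_2)\; z\; (\lambda n.m\; n\; \kappa_3)))))$
$=_v \lambda a_1.\lambda a_2.\lambda a_3.j\,(J\; (\lambda\kappa_3.(i\; a_1)\; (i\; a_3)\; (\lambda m.(i\; a_2)\; (i\; a_3)\; (\lambda n.m\; n\; \kappa_3))))$
$=_v \lambda a_1.\lambda a_2.\lambda a_3.\lambda a_4.$

$\quad j_o\,(J_o\; (\lambda\kappa.(i\; a_1)\; (i\; a_3)\; (\lambda m.(i\; a_2)\; (i\; a_3)\; (\lambda n.m\; n\; (\lambda a.a\; (i\; a_4)\; \kappa)))))$

Thus, in the typable context

$C[\cdot] = (\lambda x.c_l)\; ([\cdot]\; (\lambda a.\Omega)\; V_1\; V_2)$

where $V_1$ and $V_2$ are closed values, $C[S]$ does not halt but $C[j\,(J\; \overline{S})] \twoheadrightarrow_v c_l$. ■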

It also remains open whether there is a $\lambda_v$-definable $j$ such that $M \equiv^{obs}_v j(\overline{M})$ or even
whether an interpretation of such a $j$ exists in one of the standard semantical models of $\lambda_v$.

Another approach to a theory of continuations involves finding general methods for
proving observational congruences like that of $P_1$ and $P_2$. A theory in this spirit might exploit the
analogy between the three settings of continuation transform, continuation semantics, and
call/cc-like congruence. We conjecture that a precise match may be found among them.

Conjecture 4.5 For appropriate choice of direct semantics $\mathcal{D}[\![\cdot]\!]$, continuation semantics
$\mathcal{C}[\![\cdot]\!]$, continuation transform $\overline{M}$, and observational congruence relation $\equiv^{obs}_c$ using call/cc-like
operators in contexts,

$M \equiv^{obs}_c N$ iff $\mathcal{D}[\![M]\!] = \mathcal{D}[\![N]\!]$
iff $\mathcal{C}[\![M]\!] = \mathcal{C}[\![N]\!]$

iff $\overline{M} \equiv^{obs}_v \overline{N}$.



³ The announcement in [13] of this result is withdrawn.

Establishing this conjecture clearly requires finding a suitably matched triple of transform,
continuation semantics, and call/cc-like operators, e.g., we obviously must not try to
match up a call-by-value transform with a call-by-name direct semantics of a language with
call/cc-like operators.

Developing reliable principles for reasoning about continuations is the ultimate goal of
this research, and it is unclear (at this time) which of these two approaches will yield general
principles. Both avenues are being pursued.



Appendix A

Standard Theorems for the Language $\lambda_v$



The appendix is a compendium of some standard facts about the language $\lambda_v$. Similar
results appear in [2, 19, 20] for call-by-name languages; the techniques for proving these
facts carry over largely to the case of $\lambda_v$. The results are stated and proved with little
comment.

A.1 Church-Rosser Theorem

We follow the proof in [2], using a technique due to Tait and Martin-Löf.

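Definition A.1 The relation $\Rightarrow_p$, the parallel reduction relation, is defined inductively
as follows:

$$M \Rightarrow_p M \qquad \mathsf{pred}\; c_0 \Rightarrow_p c_0$$

$$\mathsf{succ}\; c_j \Rightarrow_p c_{j+1} \qquad \mathsf{pred}\; c_{j+1} \Rightarrow_p c_j$$

$$\frac{P \Rightarrow_p P'}{\mathsf{cond}\; c_0\; P\; Q \Rightarrow_p P'} \qquad \frac{Q \Rightarrow_p Q'}{\mathsf{cond}\; c_{l+1}\; P\; Q \Rightarrow_p Q'}$$

$$\frac{B \Rightarrow_p B', \quad P \Rightarrow_p P', \quad Q \Rightarrow_p Q'}{\mathsf{cond}\; B\; P\; Q \Rightarrow_p \mathsf{cond}\; B'\; P'\; Q'}$$

$$\frac{M \Rightarrow_p M', \quad N \Rightarrow_p N'}{M\; N \Rightarrow_p M'\; N'} \qquad \frac{M \Rightarrow_p M'}{\lambda x.M \Rightarrow_p \lambda x.M'}$$

$$\frac{M \Rightarrow_p M', \quad N \Rightarrow_p V}{(\lambda x.M)\; N \Rightarrow_p M'[x := V]}$$

$$\frac{M \Rightarrow_p M'}{\mu f.M \Rightarrow_p \mu f.M'} \qquad \frac{M \Rightarrow_p M', \quad \mu f.M \Rightarrow_p N}{\mu f.M \Rightarrow_p M'[f := N]}$$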

Lemma A.2 If $N \Rightarrow_p N'$ and $v$ is any variable, then $M[v := N] \Rightarrow_p M[v := N']$.

Proof: By structural induction on $M$. There are two cases to consider in the base case:

Case 1: $M = v$; then $M[v := N] = N \Rightarrow_p N' = M[v := N']$.

Case 2: $M = v'$ for $v'$ some constant or variable not equal to $v$. Then

$M[v := N] = v' \Rightarrow_p v' = M[v := N']$.

There are six cases in the induction case:

Case 1: $M = \lambda v.P$; then $M[v := N] = M \Rightarrow_p M = M[v := N']$.
Case 2: $M = \mu v.P$; then $M[v := N] = M \Rightarrow_p M = M[v := N']$.
Case 3: $M = \lambda y.P$. By induction, $P[v := N] \Rightarrow_p P[v := N']$; thus,

$M[v := N] \Rightarrow_p M[v := N']$.

Case 4: $M = \mu f.P$. Similar to the previous case.

Case 5: $M = \mathsf{cond}\; P_1\; P_2\; P_3$; by induction, $P_i[v := N] \Rightarrow_p P_i[v := N']$. Thus,

$M[v := N] \Rightarrow_p M[v := N']$.

Case 6: $M = (P_1\; P_2)$; by induction, $P_i[v := N] \Rightarrow_p P_i[v := N']$, so

$M[v := N] \Rightarrow_p M[v := N']$.

This completes the proof. ■

Lemma A.3 Suppose $M \Rightarrow_p M'$ and $N \Rightarrow_p N'$. If $v$ is a $\lambda$-variable and $N$ is a value, then
$M[v := N] \Rightarrow_p M'[v := N']$. If $v$ is a $\mu$-variable, then $M[v := N] \Rightarrow_p M'[v := N']$.

Proof: By induction on the definition of $M \Rightarrow_p M'$. In the base case, there are four cases:

Case 1: $M' = M$. By Lemma A.2, $M[v := N] \Rightarrow_p M'[v := N']$.

Case 2: $M = \mathsf{succ}\; c_j$ and $M' = c_{j+1}$. Then $M[v := N] = M \Rightarrow_p M' = M'[v := N']$.

Case 3: $M = \mathsf{pred}\; c_0$ and $M' = c_0$. Similar to the previous case.

Case 4: $M = \mathsf{pred}\; c_{j+1}$ and $M' = c_j$. Similar to the previous case.

This completes the base case. In the induction case, there are ten cases:

Case 1: $M = \mathsf{cond}\; c_0\; P_2\; P_3$ and $M' = P_2'$. By induction, $P_2[v := N] \Rightarrow_p P_2'[v := N']$.
Thus, $M[v := N] \Rightarrow_p M'[v := N']$.

Case 2: $M = \mathsf{cond}\; c_{l+1}\; P_2\; P_3$ and $M' = P_3'$. Similar to the previous case.

Case 3: $M = \mathsf{cond}\; P_1\; P_2\; P_3$ and $M' = \mathsf{cond}\; P_1'\; P_2'\; P_3'$. By induction, we know that
$P_i[v := N] \Rightarrow_p P_i'[v := N']$. Thus, $M[v := N] \Rightarrow_p M'[v := N']$.

Case 4: $M = \lambda x.P$ and $M' = \lambda x.P'$. If $v = x$, then

$M[v := N] = M \Rightarrow_p M' = M'[v := N']$.

If $v \neq x$, then by induction $P[v := N] \Rightarrow_p P'[v := N']$, so $M[v := N] \Rightarrow_p M'[v := N']$.

Case 5: $M = P\; Q$ and $M' = P'\; Q'$. Similar to Case 3.

Case 6: $M = (\lambda v.P)\; Q$ and $M' = P'[v := Q']$, where $P \Rightarrow_p P'$, $Q \Rightarrow_p Q'$, and $Q'$ is
a value. By the induction hypothesis, $Q[v := N] \Rightarrow_p Q'[v := N']$. Also, since $v$ is a
$\lambda$-variable, $N$ must be a value, so $Q'[v := N']$ must be a value. We can thus use the rules
of $\Rightarrow_p$:

$M[v := N] \Rightarrow_p P'[v := Q'[v := N']] = M'[v := N']$.

Case 7: $M = (\lambda x.P)\; Q$ and $M' = P'[x := Q']$, where $v \neq x$, $P \Rightarrow_p P'$, $Q \Rightarrow_p Q'$, and
$Q'$ is a value. By the induction hypothesis, $P[v := N] \Rightarrow_p P'[v := N']$ and similarly
for $Q$. If $v$ is a $\lambda$-variable, then $Q'[v := N']$ is a value since $N$ is a value by hypothesis;
if $v$ is a $\mu$-variable, $Q'[v := N']$ is a value no matter what $N$ is since $Q'$ is a value.
Thus,

$M[v := N] \Rightarrow_p P'[v := N'][x := Q'[v := N']] = P'[x := Q'][v := N'] = M'[v := N']$.

Case 8: $M = \mu f.P$ and $M' = \mu f.P'$. Similar to Case 4.

Case 9: $M = \mu v.P$ and $M' = P'[v := Q']$, where $P \Rightarrow_p P'$, $M \Rightarrow_p Q'$. Note that $v$
cannot be free in $Q'$, since it is not free in $M$. Thus,

$M[v := N] = M \Rightarrow_p M' = M'[v := N']$.

Case 10: $M = \mu f.P$ and $M' = P'[f := Q']$, where $f \neq v$, $P \Rightarrow_p P'$ and $M \Rightarrow_p Q'$. By
induction, $M[v := N] \Rightarrow_p Q'[v := N']$ and $P[v := N] \Rightarrow_p P'[v := N']$. Thus,

$M[v := N] \Rightarrow_p P'[v := N'][f := Q'[v := N']] = P'[f := Q'][v := N'] = M'[v := N']$.

This completes the proof. ■



Lemma A.4 The relation $\Rightarrow_p$ is Church-Rosser.

Proof: Suppose $M \Rightarrow_p M_1$ and $M \Rightarrow_p M_2$. To show that there is an $M_3$ with $M_1 \Rightarrow_p M_3$
and $M_2 \Rightarrow_p M_3$, proceed by induction on the proof of $M \Rightarrow_p M_1$. In the base case, there
are four cases:

Case 1: $M_1 = M$. Pick $M_3 = M_2$; this satisfies the conditions.

Case 2: $M = \mathsf{succ}\; c_j$ and $M_1 = c_{j+1}$. Pick $M_3 = c_{j+1}$; since $M_2$ can only be $M$ or $M_1$,
this choice of $M_3$ suffices.

Case 3: $M = \mathsf{pred}\; c_0$ and $M_1 = c_0$. Pick $M_3 = c_0$; as with the previous case, this $M_3$
meets the conditions since $M_2$ can only be $M$ or $M_1$.

Case 4: $M = \mathsf{pred}\; c_{j+1}$ and $M_1 = c_j$. Pick $M_3 = c_j$; again, this choice suffices.

This completes the base case. In the induction case, there are eight cases to consider:

Case 1: $M = \mathsf{cond}\; c_0\; P_2\; P_3$ and $M_1 = P_2'$. Then $M_2$ is either $P_2''$ or $\mathsf{cond}\; c_0\; P_2''\; P_3''$. By
the induction hypothesis, there is a $P_2'''$ with $P_2' \Rightarrow_p P_2'''$ and $P_2'' \Rightarrow_p P_2'''$. Then picking
$M_3$ to be $P_2'''$ works.

Case 2: $M = \mathsf{cond}\; c_{l+1}\; P_2\; P_3$ and $M_1 = P_3'$. Similar to the previous case.

Case 3: $M = \mathsf{cond}\; P_1\; P_2\; P_3$ and $M_1 = \mathsf{cond}\; P_1'\; P_2'\; P_3'$, where $P_i \Rightarrow_p P_i'$. Then $M_2$
is either $P_2''$, $P_3''$, or $\mathsf{cond}\; P_1''\; P_2''\; P_3''$. By the induction hypothesis, there are $P_i'''$ with
$P_i' \Rightarrow_p P_i'''$ and $P_i'' \Rightarrow_p P_i'''$. Then picking $M_3$ to be either $P_2'''$, $P_3'''$, or $\mathsf{cond}\; P_1'''\; P_2'''\; P_3'''$
(as appropriate) works.

Case 4: $M = \lambda x.P$ and $M_1 = \lambda x.P'$, where $P \Rightarrow_p P'$. Then $M_2$ must also be of
the form $\lambda x.P''$. By induction, pick $P'''$ where $P' \Rightarrow_p P'''$ and $P'' \Rightarrow_p P'''$. Then
$M_3 = \lambda x.P'''$ will work.

Case 5: $M = (\lambda x.P)\; Q$ and $M_1 = P'[x := Q']$, where $P \Rightarrow_p P'$, $Q \Rightarrow_p Q'$, and $Q'$ is a
value. There are two subcases:

Subcase i: $M_2 = (\lambda x.P'')\; Q''$. By induction, there are $P'''$ and $Q'''$ with $P' \Rightarrow_p P'''$
and $P'' \Rightarrow_p P'''$; pick $Q'''$ similarly. Since $Q'$ is a value, $Q'''$ must also be a value.
Pick $M_3 = P'''[x := Q''']$; $M_2 \Rightarrow_p M_3$ easily, and $M_1 \Rightarrow_p M_3$ by Lemma A.3.

Subcase ii: $M_2 = P''[x := Q'']$. By induction, there are two terms $P'''$ and $Q'''$
with $P' \Rightarrow_p P'''$ and $P'' \Rightarrow_p P'''$; pick $Q'''$ similarly. Since $Q'$ is a value, $Q'''$ must
also be a value. Pick $M_3 = P'''[x := Q''']$; then both $M_2 \Rightarrow_p M_3$ and $M_1 \Rightarrow_p M_3$
by Lemma A.3.

Case 6: $M = P\; Q$ and $M_1 = P'\; Q'$, where $P \Rightarrow_p P'$ and $Q \Rightarrow_p Q'$. There are two
subcases:

Subcase i: $M_2 = P''\; Q''$. By induction, pick $P'''$ with $P' \Rightarrow_p P'''$ and $P'' \Rightarrow_p P'''$;
pick $Q'''$ similarly. Then $M_3 = P'''\; Q'''$ works.

Subcase ii: $P = \lambda x.R$ and $M_2 = R''[x := Q'']$, for $Q''$ a value. Then $P' = \lambda x.R'$.
By induction, pick $R'''$ with $R' \Rightarrow_p R'''$ and $R'' \Rightarrow_p R'''$; pick $Q'''$ similarly. As
above, note that $Q'''$ must be a value. Picking $M_3$ to be $R'''[x := Q''']$ works,
since $M_1 \Rightarrow_p M_3$ easily and $M_2 \Rightarrow_p M_3$ by Lemma A.3.

Case 7: $M = (\mu f.P)$ and $M_1 = \mu f.P'$.

Subcase i: $M_2 = \mu f.P''$. By induction, pick $P'''$ as before; then $M_3 = \mu f.P'''$
works.

Subcase ii: $M_2 = P''[f := Q'']$, where $P \Rightarrow_p P''$ and $M \Rightarrow_p Q''$. By induction,
pick $P'''$ as before, and let $M_3 = P'''[f := Q'']$; $M_1 \Rightarrow_p M_3$ by the rules of $\Rightarrow_p$,
and $M_2 \Rightarrow_p M_3$ by Lemma A.3.

Case 8: $M = (\mu f.P)$ and $M_1 = P'[f := Q']$, where $P \Rightarrow_p P'$ and $M \Rightarrow_p Q'$.

Subcase i: $M_2 = \mu f.P''$. By induction, pick $P'''$ as before and pick $Q'''$ where
$Q' \Rightarrow_p Q'''$ and $M_2 \Rightarrow_p Q'''$. Then $M_3 = P'''[f := Q''']$ works, since $M_1 \Rightarrow_p M_3$
by Lemma A.3 and $M_2 \Rightarrow_p M_3$ by the rules of $\Rightarrow_p$.

Subcase ii: $M_2 = P''[f := Q'']$, where $P \Rightarrow_p P''$ and $M \Rightarrow_p Q''$. By induction,
pick $P'''$ and $Q'''$ as before, and let $M_3 = P'''[f := Q''']$; then $M_1 \Rightarrow_p M_3$ and
$M_2 \Rightarrow_p M_3$ by Lemma A.3.



Definition A.5 $M \Rightarrow_v N$ iff $M =_v N$ using no instance of the symmetry axiom.

Lemma A.6 $M \Rightarrow_p^{*} N$ iff $M \Rightarrow_v N$.

Proof: Let $\Rightarrow_v^{1}$ be the relation of doing 0 or 1 steps of $=_v$ without using the symmetry axiom.
When treated as sets, the relations satisfy

$\Rightarrow_v^{1} \;\subseteq\; \Rightarrow_p \;\subseteq\; \Rightarrow_v$.

Since $\Rightarrow_v$ is the transitive closure of $\Rightarrow_v^{1}$, it is also the transitive closure of $\Rightarrow_p$. ■



Theorem A.7 The relation $\Rightarrow_v$ is Church-Rosser.

Proof: Since $\Rightarrow_p$ is Church-Rosser, its transitive closure $\Rightarrow_p^{*}$ is also [2]. By Lemma A.6,
$\Rightarrow_v$ is Church-Rosser. ■

The most important consequence of the Church-Rosser theorem is

Theorem 2.2 If $M =_v N$, then $M \equiv^{obs}_v N$.

Proof: If $M \Rightarrow_v N$, then $C[M] \Rightarrow_v C[N]$. Thus, if either $C[M]$ or $C[N]$ reduces to $c_l$ under
$\twoheadrightarrow_v$, both of them will by Theorem A.7. The theorem then follows by an easy induction on
the number of occurrences of the symmetry rule. ■



A.2 Applicative Congruence

At each step, the relation $\to_v$ reduces only one subterm. We call that subterm the active
subterm [20]. An examination of the operational rules leads to the following definition.

Definition A.8 The active subterm of a non-value, closed term $M$ is

• $M$ if $M$ is of the form $(\mathsf{succ}\; c_l)$, $(\mathsf{pred}\; c_l)$, $(\mathsf{cond}\; c_l\; M_0\; M_1)$, $(\mu f.M_0)$, or $((\lambda x.M_0)\; V)$
for $V$ a value; or

• the active subterm in $M'$, where $M'$ is closed and not a value, if $M$ has the form
$(\mathsf{succ}\; M')$, $(\mathsf{pred}\; M')$, $(\mathsf{cond}\; M'\; M_0\; M_1)$, $(M'\; M_0)$, or $((\lambda x.M_0)\; M')$.

This definition matches the informal description of what the active subterm should be:

Lemma A.9 Let $M$ be a closed subterm of a non-value, closed term $C[M]$, where $M$
contains the active subterm of $C[M]$ and $C[\cdot]$ has only one hole. Then if $M \to_v M'$,

$C[M] \to_v C[M']$.

Proof: An easy structural induction on $C[\cdot]$. ■
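Definition A.8 can be read as a recursive procedure. The Python sketch below is one such reading, using a tuple encoding of closed terms (`"num"`, `"lam"`, `"succ"`, `"pred"`, `"cond"`, `"mu"`, `"app"`) that is an assumption of this example rather than notation from the text. It treats numerals and abstractions as the only closed values and assumes the input is a well-typed, closed non-value.

```python
def is_value(t):
    """Closed values in this encoding: numerals and abstractions."""
    return t[0] in ("num", "lam")

def active_subterm(t):
    """Return the subterm that the next evaluation step would reduce."""
    if is_value(t):
        raise ValueError("values have no active subterm")
    tag = t[0]
    if tag in ("succ", "pred", "cond"):
        # the first component (argument or test) is evaluated before anything else
        scrutinee = t[1]
        return t if is_value(scrutinee) else active_subterm(scrutinee)
    if tag == "mu":
        # a recursion term unfolds by itself
        return t
    if tag == "app":
        operator, operand = t[1], t[2]
        if not is_value(operator):
            return active_subterm(operator)
        if not is_value(operand):
            return active_subterm(operand)
        return t  # ((\x.M0) V): the application itself is the redex
    raise ValueError(f"unknown term: {tag}")

ident = ("lam", "x", ("var", "x"))
example_1 = ("app", ident, ("succ", ("num", 1)))          # operand is active
example_2 = ("cond", ("pred", ("num", 0)), ("num", 1), ("num", 2))
example_3 = ("app", ident, ("num", 3))                    # beta redex itself

active_1 = active_subterm(example_1)
active_2 = active_subterm(example_2)
active_3 = active_subterm(example_3)
```

For these inputs the active subterms are `("succ", ("num", 1))`, `("pred", ("num", 0))`, and `example_3` itself, following the two clauses of the definition.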



Lemma A.10 Let $M$ be a closed subterm of a non-value, closed term $C[M]$, where $M$
contains the active subterm of $C[M]$ and $C[\cdot]$ has only one hole. Then if $M \twoheadrightarrow_v M'$,

$C[M] \twoheadrightarrow_v C[M']$.

Proof: By induction on $n$, where

$M = M_0 \to_v M_1 \to_v M_2 \to_v \cdots \to_v M_n = M'$.

The base case, where $n = 0$, is trivial, so we proceed to the induction case. By the induction
hypothesis, $C[M_0] \twoheadrightarrow_v C[M_{n-1}]$. A structural induction on $C[\cdot]$ shows that $M_{n-1}$ contains
the active subterm in $C[M_{n-1}]$; thus, by Lemma A.9, $C[M_{n-1}] \to_v C[M_n]$ so the lemma
holds. ■



Lemma A.11 (Activity) Let $M$ be a closed term of type $\sigma$ and $C[\cdot]$ be a closed context
with holes of type $\sigma$. Then $C[M] \twoheadrightarrow_v c_l$ iff either

1. $C[M'] \twoheadrightarrow_v c_l$ for any $M'$; or

2. $(\lambda x.C[x])\; M \twoheadrightarrow_v c_l$.

Proof: ($\Rightarrow$) If $M$ is a value, condition (2) holds immediately. So suppose $M$ is not a value.
We use a marking technique due to Bard Bloom. Add to the language $\lambda_v$ the term $\#M$ for
any term $M$, and add the reduction rules (and only these rules)

$$\#V \to_v V, \quad V \text{ a value}$$

$$\frac{M \to_v N}{\#M \to_v \#N}$$

to the definition of the $\to_v$ relation. Note that these rules do not change the computational
behavior of $\lambda_v$, i.e., $M \twoheadrightarrow_v c_l$ iff $erase(M) \twoheadrightarrow_v c_l$ where $erase(M)$ is the result of erasing all
marks in $M$.

Proceed by induction on the number of occurrences of $\#M$ in $C[\#M]$. The base case
($n = 0$) is trivial. In the induction case, suppose that $C[\#M] \twoheadrightarrow_v c_l$. Let $C'$ be the first
term whose active subterm is contained in a subterm of $\#M$; if there is no such $C'$, then
condition (1) holds. Let $C' = D[\#M]$, where $D[\cdot]$ has one hole. Since $\#M \twoheadrightarrow_v V$ for
some unmarked value $V$, using the version of Lemma A.10 for the marked language we
conclude that $C' \twoheadrightarrow_v D[V]$. Note that there is a context $E[\cdot]$ with $n - 1$ holes such that
$D[V] = E[\#M]$. The context $E[\cdot]$ has the property that

$(\lambda x.C[x])\; M \twoheadrightarrow_v (\lambda x.C[x])\; V$
$\to_v C[V] = E[V]$.

By the induction hypothesis, either $E[M'] \twoheadrightarrow_v c_l$ for any $M'$ or $(\lambda x.E[x])\; M \twoheadrightarrow_v c_l$. If the
first condition is true, then $(\lambda x.C[x])\; M \twoheadrightarrow_v E[V] \twoheadrightarrow_v c_l$ so $(\lambda x.C[x])\; M \twoheadrightarrow_v c_l$. If the second
condition is true, then $(\lambda x.C[x])\; M \twoheadrightarrow_v E[V] \twoheadrightarrow_v c_l$ since $(\lambda x.E[x])\; M \twoheadrightarrow_v E[V] \twoheadrightarrow_v c_l$.

($\Leftarrow$) Suppose $(\lambda x.C[x])\; (\#M) \twoheadrightarrow_v c_l$. Again, proceed by induction on the number of
occurrences of $\#M$ in $C[\#M]$. The base case ($n = 0$) is trivial, so consider the induction
case. Examine the reduction sequence for $C[\#M]$, and pick the first $C'$ whose active
subterm is contained in a $\#M$; if there is no such $C'$, then $C[M'] \twoheadrightarrow_v c_l$ for any $M'$ so
$C[M] \twoheadrightarrow_v c_l$. Let $C' = D[\#M]$, where $D[\cdot]$ is a context with one hole and $\#M$ contains the
active subterm in $D[\#M]$. Then

$D[\#M] \twoheadrightarrow_v D[V] = E[\#M]$

where $V$ is a value with $\#M \twoheadrightarrow_v V$ and $E[\cdot]$ is an unmarked context with $n - 1$ holes. Since

$(\lambda x.E[x])\; (\#M) \twoheadrightarrow_v c_l$

by the induction hypothesis, $E[\#M] \twoheadrightarrow_v c_l$. Since $C[\#M] \twoheadrightarrow_v E[\#M]$, $C[\#M] \twoheadrightarrow_v c_l$. ■



Lemma A.12 Let $V_0$ and $V_1$ be closed values of the same type. If $V_0\; V \sqsubseteq_v V_1\; V$ for any
closed value $V$, then $V_0 \sqsubseteq_v V_1$.

Proof: Again, we use the marking technique. Suppose $C[\#V_0] \twoheadrightarrow_v c_l$ assuming, without
loss of generality, that $C[\cdot]$ contains no marked terms. We proceed by induction on $n$,
where an active subterm of the form $((\#V_0)\; V)$ ($V$ any closed value) appears $n$ times in
the reduction.

In the base case, $n = 0$; thus, $C[V_1] \twoheadrightarrow_v c_l$ trivially. In the induction case, pick the
first term $C'$ in the reduction sequence with an active subterm of the form $((\#V_0)\; V)$.
Let $C' = D[(\#V_0)\; V]$, where $D[\cdot]$ has one hole and the hole is active. We know that
$D[(\#V_0)\; V] \twoheadrightarrow_v c_l$. By hypothesis, $D[V_1\; V] \twoheadrightarrow_v c_l$. Let $E[\cdot]$ be the context where
$D[V_1\; V] \twoheadrightarrow_v E[\#V_0]$ and $E[\cdot]$ has no occurrences of $\#V_0$. Since $E[\#V_0] \twoheadrightarrow_v c_l$ with $(n - 1)$
reductions of the form $((\#V_0)\; V')$ for some closed value $V'$, by induction we conclude that
$E[V_1] \twoheadrightarrow_v c_l$. The lemma now follows since $C[V_1] \twoheadrightarrow_v E[V_1] \twoheadrightarrow_v c_l$. ■



Theorem 2.3 Let $M$ and $N$ be closed terms of the same type. Then $M \sqsubseteq_v N$ iff, for all
vectors $\vec{V}$ of closed values,

$M\; \vec{V} \twoheadrightarrow_v V_0$ implies $N\; \vec{V} \twoheadrightarrow_v V_0'$ and $V_0 = V_0'$ if either is a numeral.

Proof: ($\Rightarrow$) Trivial.

($\Leftarrow$) By induction on types. Consider first the base case, where $M$ and $N$ are of type $o$.
Suppose $C[\cdot]$ is a context in which $C[M] \twoheadrightarrow_v c_l$; then we know by the Activity Lemma that
either $C[M'] \twoheadrightarrow_v c_l$ for any $M'$ or $(\lambda x.C[x])\; M \twoheadrightarrow_v c_l$. In the first case, $C[N] \twoheadrightarrow_v c_l$ trivially.
In the second case, since $M$ must reduce to some numeral, say $c_{l'}$, it must be the case that
$N \twoheadrightarrow_v c_{l'}$. Thus, $C[N] \twoheadrightarrow_v c_l$, so $M \sqsubseteq_v N$.

In the induction case, again consider any $C[\cdot]$ where $C[M] \twoheadrightarrow_v c_l$. Then by the Activity
Lemma, either $C[M'] \twoheadrightarrow_v c_l$ for any $M'$ or $(\lambda x.C[x])\; M \twoheadrightarrow_v c_l$. In the first case, $C[N] \twoheadrightarrow_v c_l$
trivially. In the second case, $M \twoheadrightarrow_v V_0$ for some closed value $V_0$. Since for any vector $\vec{V}$ of
closed values,

$M\; \vec{V} \twoheadrightarrow_v V_0$ implies $N\; \vec{V} \twoheadrightarrow_v V_0'$,

it follows (using the empty vector) that $N \twoheadrightarrow_v V_1$ for some closed value $V_1$. By hypothesis,
for any closed value $V'$,

$(M\; V')\; \vec{V} \twoheadrightarrow_v c_l$ implies $(N\; V')\; \vec{V} \twoheadrightarrow_v c_l$.

By the induction hypothesis, $M\; V' \sqsubseteq_v N\; V'$ for any $V'$. By Lemma A.12, since $M \equiv^{obs}_v V_0$
and $N \equiv^{obs}_v V_1$, $M \sqsubseteq_v N$. ■